[improvement](be) Add thrift definitions for BE-side runtime filter partition pruning

Problem Summary:

Add the thrift structures needed for BE-side runtime filter partition pruning: the partition_boundaries container in TOlapScanNode, the TTargetExprMonotonicity enum, and the target_expr_monotonicity field in TRuntimeFilterDesc. Also add detailed comments to the existing TPartitionBoundary and partiton_to_tablets fields.

Key design decisions reflected in this thrift change:
- No is_default_partition field: FE omits partitions it does not want pruned.
- No is_range_partition field: the partition type is inferred from TPartitionBoundary field presence (range_start/range_end → Range, list_values → List).
- A single TTargetExprMonotonicity enum (NON_MONOTONIC / MONOTONIC_INCREASING / MONOTONIC_DECREASING) instead of two separate booleans.

Release note: None

- Test: No need to test (thrift definition only, no runtime behavior change)
- Behavior changed: No
- Does this need documentation: No

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
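The field-presence rule from the design decisions above can be sketched as follows. This is a minimal illustration, not the generated thrift code: `PartitionBoundary`, `PartitionKind`, and `infer_partition_kind` are hypothetical names, boundary values are modelled as plain strings, and treating a boundary with no fields set as non-prunable is an assumption.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the thrift-generated TPartitionBoundary.
// Optional members model thrift "optional" fields (present or absent).
struct PartitionBoundary {
    int64_t partition_id = 0;
    std::optional<std::string> range_start;
    std::optional<std::string> range_end;
    std::optional<std::vector<std::string>> list_values;
};

enum class PartitionKind { Range, List, Unknown };

// Infer the partition type from which fields are present, so no explicit
// is_range_partition flag is needed.
PartitionKind infer_partition_kind(const PartitionBoundary& pb) {
    if (pb.list_values.has_value()) {
        return PartitionKind::List;
    }
    if (pb.range_start.has_value() || pb.range_end.has_value()) {
        return PartitionKind::Range;
    }
    // Assumption: a boundary with no fields set carries no usable
    // information and is never pruned.
    return PartitionKind::Unknown;
}
```

A range partition with only one bound set (for example an open-ended first or last partition) is still classified as Range in this sketch.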
|
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
|
Pull request overview
This PR adds BE-side runtime-filter-based partition pruning for OLAP scans by having FE send partition boundary metadata to BE, then using runtime filter values to mark partitions (and associated tablets/scanners) as prunable during scan execution. It also introduces a dedicated regression test suite to validate pruning counters across range/list partitions and multiple data types.
Changes:
- Extend Thrift plan structures to carry partition boundary descriptors (and related RF metadata) from FE to BE.
- FE: populate partition boundary descriptors in the OlapScanNode thrift; annotate runtime filter targets with a monotonicity hint.
- BE: add a reusable runtime-filter partition pruner, integrate it into scan local state and scanner scheduling, and expose profile counters for pruned/total partitions.
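To make the pruning idea in the overview concrete, here is a minimal sketch of how a min/max runtime filter could rule out a range partition. It is not the PR's implementation: `RangeBoundary`, `MinMaxFilter`, and `can_prune` are hypothetical names, keys are simplified to integers, and the partition range is assumed to be half-open, `[start, end)`.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical simplified range boundary: [start, end), where an absent
// bound means "unbounded" on that side.
struct RangeBoundary {
    std::optional<int64_t> start;  // inclusive lower bound
    std::optional<int64_t> end;    // exclusive upper bound
};

// Hypothetical min/max runtime filter; both ends are inclusive.
struct MinMaxFilter {
    int64_t min;
    int64_t max;
};

// Returns true when no value in [rf.min, rf.max] can fall inside the
// partition range, so every tablet of the partition can be skipped.
bool can_prune(const RangeBoundary& p, const MinMaxFilter& rf) {
    if (p.end.has_value() && rf.min >= *p.end) {
        return true;  // filter lies entirely above the partition
    }
    if (p.start.has_value() && rf.max < *p.start) {
        return true;  // filter lies entirely below the partition
    }
    return false;  // ranges overlap (or cannot be proven disjoint)
}
```

The check only ever answers "definitely disjoint" or "keep", which is the safe direction for pruning: a false negative costs a scan, while a false positive would lose rows.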
Reviewed changes
Copilot reviewed 16 out of 16 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| regression-test/suites/query_p0/runtime_filter/rf_partition_pruning.groovy | New regression tests asserting partition pruning via profile counters across partition types and expressions. |
| regression-test/data/query_p0/runtime_filter/rf_partition_pruning.out | Golden output for the new regression suite queries. |
| regression-test/conf/regression-conf.groovy | Changes default JDBC/HTTP ports used by regression framework configuration. |
| gensrc/thrift/PlanNodes.thrift | Adds TPartitionBoundary, adds scan-node fields for partition pruning, and adds TTargetExprMonotonicity + RF descriptor field. |
| fe/fe-core/src/main/java/org/apache/doris/planner/RuntimeFilter.java | Sets target_expr_monotonicity for eligible RF targets. |
| fe/fe-core/src/main/java/org/apache/doris/planner/OlapScanNode.java | Populates partition_boundaries on TOlapScanNode based on selected partitions. |
| be/src/exec/scan/scanner_scheduler.cpp | Stops scanner work early when its partition has been pruned after late RF processing. |
| be/src/exec/scan/scanner.h | Adds a virtual hook for scanners to report partition-pruned state. |
| be/src/exec/scan/olap_scanner.{h,cpp} | Implements partition-pruned checks for OLAP scanners. |
| be/src/exec/runtime_filter/runtime_filter_partition_pruner.{h,cpp} | New BE component to parse boundaries and prune partitions based on RF conjuncts. |
| be/src/exec/operator/scan_operator.{h,cpp} | Integrates pruner into scan local state and adds profile counters. |
| be/src/exec/operator/olap_scan_operator.{h,cpp} | Parses FE-provided boundaries and filters tablets whose partitions are pruned. |
// Partition-id → tablet-id list mapping; used together with partition_boundaries
// for BE-side runtime filter partition pruning.
26: optional map<Types.TPartitionId, list<Types.TTabletId>> partiton_to_tablets
// Partition boundary descriptors for BE-side runtime filter partition pruning.
// Only partitions that are candidates for pruning are included; partitions FE
// does not want pruned (e.g. default catch-all) are omitted from this list.
27: optional list<TPartitionBoundary> partition_boundaries
// When the target expression is a plain SlotRef (identity), FE may omit this
// field; BE treats an absent value as NON_MONOTONIC (conservative).
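The conservative default described in this comment could look roughly like the sketch below. The enum mirrors the three values named in the PR description, but `effective_monotonicity` and `allows_range_pruning` are hypothetical helpers, and the rule that only monotonic target expressions permit range-based pruning is an assumption about how BE would use the hint.

```cpp
#include <optional>

// Mirrors the three values of TTargetExprMonotonicity named in the PR.
enum class TargetExprMonotonicity {
    NON_MONOTONIC,
    MONOTONIC_INCREASING,
    MONOTONIC_DECREASING,
};

// An absent field (FE did not set it) falls back to NON_MONOTONIC.
TargetExprMonotonicity effective_monotonicity(
        std::optional<TargetExprMonotonicity> from_fe) {
    return from_fe.value_or(TargetExprMonotonicity::NON_MONOTONIC);
}

// Assumption: range-boundary comparisons are only safe when the target
// expression preserves (or exactly reverses) the ordering of its input.
bool allows_range_pruning(TargetExprMonotonicity m) {
    return m != TargetExprMonotonicity::NON_MONOTONIC;
}
```

Under this reading, an omitted hint disables range pruning for that filter rather than risking incorrect results.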
for (RuntimeFilterTarget target : targets) {
    if (target.expr instanceof SlotRef && target.node instanceof OlapScanNode) {
        SlotRef slotRef = (SlotRef) target.expr;
        Column col = slotRef.getColumn();
        if (col != null) {
            OlapScanNode scanNode = (OlapScanNode) target.node;
            OlapTable table = scanNode.getOlapTable();
            PartitionType partType = table.getPartitionInfo().getType();
            if (partType == PartitionType.RANGE || partType == PartitionType.LIST) {
                for (Column partCol : table.getPartitionInfo().getPartitionColumns()) {
                    if (partCol.getName().equalsIgnoreCase(col.getName())) {
                        tFilter.setTargetExprMonotonicity(
                                TTargetExprMonotonicity.MONOTONIC_INCREASING);
                        break;
                    }
                }
            }
        }
    }
    // Only check the first target; if there are multiple targets on
    // different tables, monotonicity may differ. We conservatively
    // set it based on the first target only.
    break;
}
for (const auto& pb : boundaries_it->second) {
    if (_pruned_partition_ids.contains(pb.partition_id) ||
        newly_pruned.contains(pb.partition_id)) {
        continue;
    }
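The excerpt above shows only the skip check at the top of the loop. A self-contained sketch of the surrounding pattern, under assumptions, might look like this: `Boundary`, `collect_newly_pruned`, and the `is_prunable` callback are hypothetical, and the actual pruning test in the PR is not shown in the excerpt.

```cpp
#include <cstdint>
#include <functional>
#include <unordered_set>
#include <vector>

// Hypothetical minimal boundary record; only the id matters here.
struct Boundary {
    int64_t partition_id;
};

// Walk the boundaries once, skipping partitions that were pruned by an
// earlier runtime filter or earlier in this pass, and collect the ids
// that the supplied predicate newly marks as prunable.
std::unordered_set<int64_t> collect_newly_pruned(
        const std::vector<Boundary>& boundaries,
        const std::unordered_set<int64_t>& already_pruned,
        const std::function<bool(const Boundary&)>& is_prunable) {
    std::unordered_set<int64_t> newly_pruned;
    for (const auto& pb : boundaries) {
        if (already_pruned.count(pb.partition_id) > 0 ||
            newly_pruned.count(pb.partition_id) > 0) {
            continue;  // nothing more to decide for this partition
        }
        if (is_prunable(pb)) {
            newly_pruned.insert(pb.partition_id);
        }
    }
    return newly_pruned;
}
```

Returning the new ids separately lets the caller merge them into the long-lived pruned set in one step, for example after a late-arriving filter has been processed.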
jdbcUrl = "jdbc:mysql://127.0.0.1:9333/?useLocalSessionState=true&allowLoadLocalInfile=true&zeroDateTimeBehavior=round"
targetJdbcUrl = "jdbc:mysql://127.0.0.1:9333/?useLocalSessionState=true&allowLoadLocalInfile=true&zeroDateTimeBehavior=round"
feSyncerPassword = ""

feHttpAddress = "127.0.0.1:8030"
feHttpAddress = "127.0.0.1:8333"
|
run buildall |
|
/review |
|
OpenCode automated review failed and did not complete. Error: Review step was skipped (possibly timeout or cancelled) Please inspect the workflow logs and rerun the review after the underlying issue is resolved. |
Cloud UT Coverage Report
Increment line coverage
Increment coverage report
|
|
/review |
BE Regression && UT Coverage Report
Increment line coverage
Increment coverage report
|
FE Regression Coverage Report
Increment line coverage
|
OpenCode automated review failed and did not complete. Error: Review step was failure (possibly timeout or cancelled) Please inspect the workflow logs and rerun the review after the underlying issue is resolved. |