Push sort requirements through simple projections#23288
Conversation
|
@zhuqi-lucas might have a look at the PR |
|
@xudong963 @alamb It seems duplicated with this: |
I don't think so due to comment in mentioned PR. |
74089c8 to
0e75eea
Compare
|
@zhuqi-lucas thanks, rebased, updated missed tpch tests. |
|
This is nice - I think I found this earlier (but abandoned because it didn't move benchmarks). I think it is a good change, it should reduce sort / memory usage somewhat. |
|
run benchmarks |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing push_sorts_through_simple_projections (0e75eea) to 31198ba (merge-base) diff using: clickbench_partitioned File an issue against this benchmark runner |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing push_sorts_through_simple_projections (0e75eea) to 31198ba (merge-base) diff using: tpcds File an issue against this benchmark runner |
|
🤖 Benchmark running (GKE) | trigger CPU Details (lscpu)Comparing push_sorts_through_simple_projections (0e75eea) to 31198ba (merge-base) diff using: tpch File an issue against this benchmark runner |
|
🤖 Benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usagetpch — base (merge-base)
tpch — branch
File an issue against this benchmark runner |
|
🤖 Benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usagetpcds — base (merge-base)
tpcds — branch
File an issue against this benchmark runner |
|
🤖 Benchmark completed (GKE) | trigger Instance: CPU Details (lscpu)Details
Resource Usageclickbench_partitioned — base (merge-base)
clickbench_partitioned — branch
File an issue against this benchmark runner |
| /// The implementation is intentionally conservative: computed projection | ||
| /// expressions and non-column sort expressions are not pushed down. Returning | ||
| /// `Ok(None)` leaves sorting above the projection, preserving correctness. | ||
| fn handle_projection_pushdown( |
There was a problem hiding this comment.
It seems remap_requirement_through_projection in #23199 does a similar thing — can we deduplicate them?
There was a problem hiding this comment.
Yeah, but it's more permissive. It filters out all the expressions that are not Column's. This one don't go if any expression is not Column.
I reused remap_requirement_through_projection inside the handle_projection_pushdown.
handle_projection_pushdown just adds additional check layer
0e75eea to
6bc7194
Compare
6bc7194 to
01fb7a2
Compare
| } else if let Some(aggregate_exec) = plan.downcast_ref::<AggregateExec>() { | ||
| handle_aggregate_pushdown(aggregate_exec, parent_required) | ||
| } else if let Some(projection_exec) = plan.downcast_ref::<ProjectionExec>() { | ||
| // A `ProjectionExec` with a fetch is handled in the fetch branch above |
There was a problem hiding this comment.
The comments here is not easy to understanding, we may polish it or move to other places.
A ProjectionExec with a fetch is handled in the fetch branch above
There was a problem hiding this comment.
It came from recently merged #23199. What do you suggest? I think about removing it at all
There was a problem hiding this comment.
Yeah, remove it — the "handled in fetch branch above" note was informative at the old short-circuit line (which this PR removes), but here it's noise. Move the "column-only" rationale into handle_projection_pushdown's doc comment.
| } | ||
|
|
||
| #[tokio::test] | ||
| async fn test_push_sort_through_reordered_projection_to_union() -> Result<()> { |
There was a problem hiding this comment.
Test coverage is currently thin (one Union scenario); a few negative /alias / fetch-interaction cases would help.
|
Thank you for opening this pull request! Reviewer note: cargo-semver-checks reported the current version number is not SemVer-compatible with the changes in this pull request (compared against the base branch). Details |
Allow EnforceSorting sort pushdown to remap ordering requirements through ProjectionExec when the required output columns are simple aliases or reordered input columns. Prior this patch sort pushdown stopped at `ProjectionExec`, even when the projection only changed column order. This could force a `SortExec` above the projection and therefore above larger subplans such as `UnionExec`, preventing branch-local sorts from being used. The new logic conservatively handles only column-to-column projections and leaves computed expressions untouched, preserving correctness while enabling better plans for reordered projection cases.
01fb7a2 to
014d165
Compare
Allow
EnforceSortingsort pushdown to remap ordering requirements throughProjectionExecwhen the required output columns are simple aliases or reordered input columns.Which issue does this PR close?
Prior this patch sort pushdown stopped at
ProjectionExec, even when the projection only changed column order. This could force aSortExecabove the projection and therefore above larger subplans such asUnionExec, preventing branch-local sorts from being used.The new logic conservatively handles only column-to-column projections and leaves computed expressions untouched, preserving correctness while enabling better plans for reordered projection cases.
Are these changes tested?
Yes:
test_push_sort_through_reordered_projection_to_unionadded.Are there any user-facing changes?
No