Preserve recursive CTE nullability across logical and physical planning #22037

Open
kosiew wants to merge 25 commits into apache:main from kosiew:nullability-mismatch-22034

kosiew (Contributor) commented May 6, 2026

Which issue does this PR close?

Rationale for this change

Recursive CTEs can widen column nullability between the anchor term and recursive term. Prior to this change, the recursive work table and recursive query output schema inherited the anchor term schema directly, which could incorrectly preserve non-nullability.

This caused downstream logical optimizations and physical schema checks to operate on overly strict schemas. In particular, nullability-based simplifications could incorrectly remove semantically required predicates in recursive queries.

This PR makes recursive CTE schema handling conservative for nullability while preserving all other schema dimensions exactly.
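
To make the failure mode concrete, here is a minimal sketch in plain Rust. It does not use DataFusion's types or optimizer: `Predicate`, `simplify_is_not_null`, and `widened_nullable` are hypothetical names introduced only for illustration. It assumes an optimizer that replaces `col IS NOT NULL` with a constant `true` whenever the schema declares `col` non-nullable.

```rust
/// Toy result of simplifying the predicate `col IS NOT NULL`.
/// Hypothetical type for illustration; not a DataFusion type.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Predicate {
    /// The predicate is kept and evaluated at runtime.
    IsNotNull,
    /// The predicate was folded away because the column was declared non-nullable.
    AlwaysTrue,
}

/// Simplify `col IS NOT NULL` the way a nullability-aware optimizer might:
/// if the schema says the column can never be NULL, the check is redundant.
fn simplify_is_not_null(column_nullable: bool) -> Predicate {
    if column_nullable {
        Predicate::IsNotNull
    } else {
        Predicate::AlwaysTrue
    }
}

/// Conservative nullability for a recursive CTE column: nullable if either
/// the anchor term or the recursive term can produce NULL.
fn widened_nullable(anchor_nullable: bool, recursive_nullable: bool) -> bool {
    anchor_nullable || recursive_nullable
}
```

With an anchor column that is non-nullable (for example `SELECT 0 AS n`) and a recursive column that is nullable, `simplify_is_not_null(false)` folds the filter away, while `simplify_is_not_null(widened_nullable(false, true))` keeps it. The second outcome is the one the recursive term needs in order to terminate.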

What changes are included in this PR?

  • Added internal recursive schema reconciliation helpers in datafusion_common::recursive_schema for:

    • widening schema nullability while preserving metadata and field properties
    • deriving recursive query output schemas
    • reconciling logical and physical schemas when only nullability differs
  • Updated recursive CTE planning to:

    • create nullable work table schemas
    • derive recursive query output schemas from both anchor and recursive terms
    • preserve anchor-term schema metadata, field names, and functional dependencies
  • Added a plan_with_schema helper to rebuild plans with reconciled schemas

  • Updated RecursiveQuery reconstruction paths in:

    • logical plan rebuilding
    • protobuf deserialization
  • Relaxed aggregate physical schema validation for recursive queries when the only mismatch is widened nullability

  • Simplified recursive work table reference detection using LogicalPlan::exists

  • Updated explain output expectations to reflect explicit casts introduced by schema coercion

  • Added regression coverage for recursive CTE nullability handling
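
The nullability-only reconciliation described in the list above can be sketched as follows. This is a simplified model under stated assumptions, not the code in `datafusion_common::recursive_schema`: the `Field` struct and `reconcile_nullability` function are hypothetical stand-ins, data types are plain strings, and the accepted direction (the target may be more nullable than the source, never less) is an assumption about what "widened nullability" means here.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for a schema field; not the Arrow or DataFusion type.
#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: String,
    nullable: bool,
    metadata: BTreeMap<String, String>,
}

/// Return a copy of `target` if `source` differs from it only by widened
/// nullability. Any other difference (count, name, type, metadata) is an error.
fn reconcile_nullability(source: &[Field], target: &[Field]) -> Result<Vec<Field>, String> {
    // Check the length explicitly: `zip` alone would silently truncate.
    if source.len() != target.len() {
        return Err(format!(
            "column count mismatch: {} vs {}",
            source.len(),
            target.len()
        ));
    }
    for (s, t) in source.iter().zip(target) {
        if s.name != t.name {
            return Err(format!("name mismatch: {} vs {}", s.name, t.name));
        }
        if s.data_type != t.data_type {
            return Err(format!("type mismatch for column {}", s.name));
        }
        if s.metadata != t.metadata {
            return Err(format!("metadata mismatch for column {}", s.name));
        }
        // Assumption: only widening (non-nullable -> nullable) is safe.
        if s.nullable && !t.nullable {
            return Err(format!("column {} would narrow nullability", s.name));
        }
    }
    Ok(target.to_vec())
}
```

Rejecting everything except widening keeps the relaxation narrow: a plan that yields non-null values can always be read under a nullable schema, but not the other way around.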

Are these changes tested?

Yes.

Added unit tests in datafusion/common/src/recursive_schema.rs covering:

  • metadata preservation when widening nullability
  • recursive output schema reconciliation
  • nullability-only schema reconciliation
  • rejection of unsupported schema mismatches

Added sqllogictest coverage in datafusion/sqllogictest/test_files/cte.slt for recursive CTE nullability behavior, including a regression test ensuring IS NOT NULL predicates are preserved correctly.

Updated existing explain and physical plan expectations in:

  • datafusion/core/tests/sql/explain_analyze.rs
  • datafusion/sqllogictest/test_files/cte.slt
  • datafusion/sqllogictest/test_files/explain_tree.slt

Are there any user-facing changes?

Yes.

Recursive CTEs now conservatively widen output nullability when recursive terms can produce nullable values. This fixes incorrect optimizer behavior and prevents physical planning failures caused by nullability mismatches in recursive queries.

LLM-generated code disclosure

This PR includes LLM-generated code and comments. All LLM-generated content has been manually reviewed and tested.

kosiew added 5 commits May 6, 2026 11:49
- Added `align_plan_to_schema` and `SchemaAlignExec` for improved schema alignment in execution plans.
- Maintained strict behavior in `project_plan_to_schema` for projection-only cases.
- Updated adapter to handle nullability narrowing while preserving SQL behavior.
- Modified `RecursiveQueryExec` to preserve static/declared schema and aligned recursive term at plan construction.
- Removed nullability-widening schema synthesis for cleaner execution.
- Restored `0 AS` level in SQL logic test file `cte.slt`.
…ent behavior

- Added direct tests for align_plan_to_schema:
  - Verified exact schema returns the same plan.
  - Ensured rename-only uses ProjectionExec.
  - Confirmed nullability narrowing uses SchemaAlignExec.
  - Tested count/type/field metadata/schema metadata errors.
- Documented conservative property behavior in the adapter path.
- Refactored `align_plan_to_schema` function to store input schema in a variable, reducing redundant calls.
- Updated validation and comparison logic for better clarity and performance.
- Simplified partitioning handling in `SchemaAlignExec` by consolidating pattern matching.
- Enhanced `DisplayAs` implementation to correctly handle `TreeRender` format.
…odules

- Reuse `input_schema` in common.rs
- Simplify projected return using `debug_assert_eq!`
- Utilize `partition_count()` in common.rs
- Modify TreeRender to return `Ok(())`
- Reuse `static_schema` in tests for recursive_query.rs
- Removed redundant upfront align validation in common.rs.
- Added test helpers in common.rs:
  - single_field_schema
  - single_i32_exec
  - metadata mismatch builders
- Shortened repeated test setup in common.rs.
- Added recursive_exec test helper in recursive_query.rs.
- Simplified RecursiveQueryExec::try_new(...) in recursive_query.rs.
github-actions bot added the sqllogictest (SQL Logic Tests (.slt)) and physical-plan (Changes to the physical-plan crate) labels May 6, 2026
kosiew marked this pull request as ready for review May 6, 2026 06:17
neilconway (Contributor) commented May 9, 2026

Please let me know if I'm understanding this correctly:

  • The PR aims to address a situation where there is a schema mismatch between the anchor and recursive cases in a CTE
  • In particular, we might infer different nullability properties between the anchor vs the recursive query -- e.g., if we have 0 in the anchor and min(...) in the recursive case, 0 is non-nullable and min(...) is nullable (as an aside, the latter is conservative: min(x) without FILTER in a grouped query is non-nullable if x is non-nullable, but I suppose this is a separate planner shortcoming...)
  • The proposed behavior is to apply the anchor schema to the recursive CTE schema. So we would effectively be requiring that a nullable min expression never returns a NULL, in the example above
  • If the recursive query does return a NULL, we produce a runtime error

If that is accurate, then the proposed behavior would result in this query producing an error:

SET datafusion.execution.enable_recursive_ctes = true;

  WITH RECURSIVE t AS (
    SELECT 0 AS n
    UNION ALL
    SELECT CAST(NULL AS INT) AS n FROM t WHERE n IS NOT NULL
  )
  SELECT * FROM t;

The error would be "Column 'n' is declared as non-nullable but contains null values", but this query seems entirely reasonable to me and is allowed by other SQL implementations (e.g., Postgres, DuckDB, MariaDB, SQLite).

Instead, shouldn't we be computing the CTE's logical schema by widening the anchor and the recursive schemas? This is conceptually similar to what we do for UNION. That is, if the anchor expression is non-nullable and the recursive expression is nullable, the output schema should be nullable.

kosiew added 6 commits May 10, 2026 23:37
…and tests

- Added `schema: DFSchemaRef` to `RecursiveQuery`.
- Updated `LogicalPlan::RecursiveQuery.schema()` to return the stored schema.
- Introduced `RecursiveQuery::try_new(...)` for schema derivation based on static anchor field names, qualifiers, data types, nullability, and intersected metadata.
- Implemented manual `PartialOrd` for `RecursiveQuery`.
- Modified `to_recursive_query` to utilize `RecursiveQuery::try_new(...)`.
- Added unit test for widening nullability in recursive query schema.
- Ensured `RecursiveQuery` rebuilds correctly after child transforms using `try_new(...)`.
- Updated deserialization of `RecursiveQuery` to leverage `try_new(...)`.
- Enhanced `RecursiveQueryExec::try_new` to derive widened output schema using static and recursive schemas.
- Introduced a helper function for generating recursive query output schema.
- Updated tests for execution schema handling of recursive nullable outputs.
- Added a SQL regression test to verify recursive term behavior and expected output.
- Addressed issue with the work table being planned with anchor/static schema only.
- Modified logic to ensure that recursive term is planned once with anchor schema, preventing non-null optimizations that lead to infinite NULL emissions.
- Built initial recursive CTE schema and recreated work table if schema nullability widened.
- Replanned the recursive term using the widened work table schema to avoid inefficiencies.
- Added private `cte_work_table_plan` in `cte.rs`
- Removed duplicated work-table source/scan construction in `cte.rs`
- Simplified `recursive_query_schema` in `plan.rs`
- Removed unnecessary Result wrapping in field collection in `plan.rs`
- Used `Field::with_metadata` in `plan.rs`
- Updated stale comment and used `Field::with_metadata` in `recursive_query.rs`
- Updated `RecursiveQuery::try_new` to validate column count and data types.
- Added direct regression tests for logical plan.
- Enhanced physical recursive schema to intersect field/schema metadata like logical schema.
- Implemented metadata regression test in physical plan.
- Improved `align_plan_to_schema` to align metadata via `SchemaAlignExec`.
- Maintained behavior in `project_plan_to_schema` to reject metadata changes.
- Added comment for projection-error fallback in common code.
- Clarified comments regarding two-pass recursive planning in SQL component.
- Updated RecursiveQueryExec to accept declared logical recursive CTE schema.
- Removed physical recursive schema recomputation, using logical schema as source of truth.
- Aligned children to declared schema.
- Introduced private recursive-CTE-local schema rebind exec for metadata/name/schema-only fixes.
- Eliminated broad global align_plan_to_schema and SchemaAlignExec, retaining narrower project_plan_to_schema.
- Renamed helper function from `align_recursive_plan_to_schema` to `align_recursive_child_to_logical_schema`.
- Updated fallback mechanism to preserve `project_plan_to_schema` errors when local rebind cannot handle cases safely.
- `RecursiveSchemaRebindExec` now rejects:
  - Schema metadata mismatches
  - Field metadata mismatches
  - Column count mismatches
  - Type mismatches
- Maintained support for nullability-only schema rebind.
- Updated tests to include:
  - Nullability rebind test
  - Field metadata rejection test
  - Schema metadata rejection test
kosiew (Contributor, Author) commented May 11, 2026

@neilconway
Thanks for this test case:

SET datafusion.execution.enable_recursive_ctes = true;

  WITH RECURSIVE t AS (
    SELECT 0 AS n
    UNION ALL
    SELECT CAST(NULL AS INT) AS n FROM t WHERE n IS NOT NULL
  )
  SELECT * FROM t;

and for this suggestion:

"shouldn't we be computing the CTE's logical schema by widening the anchor and the recursive schemas? This is conceptually similar to what we do for UNION. That is, if the anchor expression is non-nullable and the recursive expression is nullable, the output schema should be nullable."

kosiew changed the title from "Preserve recursive CTE static schema with plan-time schema alignment" to "Widen recursive CTE logical schema nullability" May 11, 2026
github-actions bot added the sql (SQL Planner), logical-expr (Logical plan and expressions), core (Core DataFusion crate), proto (Related to proto crate), and auto detected api change (Auto detected API change) labels May 11, 2026
kosiew added 4 commits May 11, 2026 10:17
- Updated function signature in recursive_query to take a reference
- Updated internal call site in with_new_children to accommodate the change
- Modified test helper and all affected test call sites in recursive_query
- Updated planner call site in physical_planner to align with new function signature
…ionale

- Added two comments in `plan.rs` to clarify the name-preservation invariant and nullability-widening rationale at the construction site.
- Updated documentation in `recursive_query.rs` to note that `output_schema` is pre-widened, ensuring safe direction for recursive CTEs.
- Introduced a new query in `cte.slt` to test distinct column aliases, reinforcing the invariant that the CTE's exposed column name comes from the anchor term.
github-actions bot removed the core (Core DataFusion crate) and auto detected api change (Auto detected API change) labels May 11, 2026

neilconway (Contributor) left a comment

This query hangs now:

  WITH RECURSIVE t(a, b) AS (
    SELECT 0 AS a, 0 AS b
    UNION ALL
    SELECT b AS a, CAST(NULL AS INT) AS b FROM t WHERE a IS NOT NULL
  )
  SELECT * FROM t;

because we incorrectly conclude that t.a is non-nullable, and so the optimizer elides the filter.

It seems a 2-pass approach isn't sufficient. One approach would be to keep replanning until we reach a fixed point (which we should always reach, since each pass can only widen nullability).

Alternatively, we could do something simpler: nullability analysis in DF is already quite conservative, and computing precise nullability for CTEs is not the low-hanging fruit if we want to make it more precise. Is it really that big of a loss if we mark CTE columns as nullable?
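
The two options can be compared with a small toy model. The sketch below is not DataFusion code: `RecursiveExpr` and `fixed_point_nullability` are hypothetical names, and the model assumes each output column of the recursive term is either a literal or a direct reference to one work-table column. It iterates nullability inference until nothing changes and reports how many passes that took.

```rust
/// How one output column of the recursive term is computed (toy model,
/// hypothetical type for illustration only).
#[derive(Debug, Clone, Copy, PartialEq)]
enum RecursiveExpr {
    /// A literal that is never NULL, e.g. `1`.
    NonNullLiteral,
    /// A NULL literal, e.g. `CAST(NULL AS INT)`.
    NullLiteral,
    /// A direct reference to column `i` of the work table.
    Column(usize),
}

/// Infer per-column nullability by re-evaluating the recursive term against
/// the current work-table nullability until a fixed point is reached.
/// Returns the final nullability and the number of passes performed.
fn fixed_point_nullability(anchor: &[bool], recursive: &[RecursiveExpr]) -> (Vec<bool>, usize) {
    assert_eq!(anchor.len(), recursive.len(), "column count mismatch");
    let mut nullable = anchor.to_vec();
    let mut passes = 0;
    loop {
        passes += 1;
        let next: Vec<bool> = recursive
            .iter()
            .enumerate()
            .map(|(i, expr)| {
                let from_recursive = match *expr {
                    RecursiveExpr::NonNullLiteral => false,
                    RecursiveExpr::NullLiteral => true,
                    RecursiveExpr::Column(j) => nullable[j],
                };
                // Nullability only ever widens, so the loop must terminate.
                nullable[i] || from_recursive
            })
            .collect();
        if next == nullable {
            return (nullable, passes);
        }
        nullable = next;
    }
}
```

For the query above, the anchor is `[false, false]` and the recursive term is `[Column(1), NullLiteral]`. The first pass only widens `b`; `a` becomes nullable on the second pass, and a third pass confirms the fixed point. A chain of n such column references needs on the order of n passes, which is the cost that marking every CTE column nullable up front avoids.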

kosiew marked this pull request as draft May 13, 2026 09:18
kosiew added 10 commits May 13, 2026 21:32
- Added SLT repro in datafusion/sqllogictest/test_files/cte.slt
- Fixed recursive CTE work-table nullability:
  - Work table schema is now conservatively nullable
  - RecursiveQuery now stores output schema
  - Schema nullability considers static OR recursive term
- Proto deserialize now rebuilds via builder
- Updated affected EXPLAIN expectations
- Removed public RecursiveQuery.schema
- Restored original public struct shape
- Kept nullability handling internal:
  - Recursive builder coerces terms to conservative nullable schema via existing projection schema override
  - Optimizer child rewrites rebuild recursive query via builder
  - Aggregate planner reconciles nullability only for recursive-query inputs
- Updated affected SLT explain output
…ly nullability widening

- Only reconciles nullability widening; rejects mismatches in count, name, type, and field/schema metadata.
- Removes zip truncation masking.
- Renamed function contains_recursive_query_input for clarity.
- Added comment to clarify aggregate recursive CTE special-case.
- Updated plan_with_schema to use input schema expressions instead of target schema columns.
- Introduced focused unit tests for validating allowed/rejected reconciliation cases.
- Adjusted SLT explain to align with the new safer projection logic.
- Added `datafusion/common/src/recursive_schema.rs` with the following functions:
  - `make_schema_nullable`
  - `recursive_query_output_schema`
  - `reconcile_dfschema_with_schema_nullability`
  - Tests for nullability widening and mismatch rejection.

- Integrated the new schema helpers into existing components:
  - Updated `sql/src/cte.rs` to use the common nullable work-table schema helper.
  - Updated `expr/src/logical_plan/builder.rs` to use the common recursive output schema helper.
  - Updated `core/src/physical_planner.rs` to use the common physical/logical reconciliation helper.

- Removed duplicated local helpers and tests from `core`, `expr`, and `sql`.

Semver:
- No breaking field/signature changes; added a doc-hidden helper module only.
- Changed projection expressions in the `parquet_recursive_projection_pushdown` test to use `CAST` for consistency and improved type safety.
- Refactored TreeNode::exists in physical planner and CTE modules
- Removed redundant recursive CTE re-coercion in logical plan builder
- Inlined small one-use variables in recursive schema module
github-actions bot added the core (Core DataFusion crate) and common (Related to common crate) labels and removed the physical-plan (Changes to the physical-plan crate) label May 13, 2026

kosiew (Contributor, Author) commented May 13, 2026

@neilconway

I implemented the simpler option, marking CTE columns as nullable, and added an slt test for:

  WITH RECURSIVE t(a, b) AS (
    SELECT 0 AS a, 0 AS b
    UNION ALL
    SELECT b AS a, CAST(NULL AS INT) AS b FROM t WHERE a IS NOT NULL
  )
  SELECT * FROM t;

kosiew changed the title from "Widen recursive CTE logical schema nullability" to "Preserve recursive CTE nullability across logical and physical planning" May 13, 2026
kosiew marked this pull request as ready for review May 13, 2026 13:55

kosiew (Contributor, Author) commented May 14, 2026

run benchmark sql_planner

@adriangbot

🤖 Criterion benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4448157662-72-fnj6f 6.12.68+ #1 SMP Wed Apr 1 02:23:28 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)
Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing nullability-mismatch-22034 (aa7fb40) to 937dfda (merge-base) diff
BENCH_NAME=sql_planner
BENCH_COMMAND=cargo bench --features=parquet --bench sql_planner
BENCH_FILTER=
Results will be posted here when complete

kosiew (Contributor, Author) commented May 14, 2026

show benchmark queue

@adriangbot

Hi @kosiew, you asked to view the benchmark queue (#22037 (comment)).

Comment      Repo               PR      User      Benchmarks                  Status
-------      ----               --      ----      ----------                  ------
#4448041895  apache/datafusion  #20381  adriangb  ["clickbench_partitioned"]  running
#4448041895  apache/datafusion  #20381  adriangb  ["tpcds"]                   running
#4448145440  apache/arrow-rs    #9972   adriangb  ["arrow_writer"]            running
#4448157662  apache/datafusion  #22037  kosiew    ["sql_planner"]             running

@adriangbot

🤖 Criterion benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)


group                                                 main                                   nullability-mismatch-22034
-----                                                 ----                                   --------------------------
logical_aggregate_with_join                           1.00    416.0±1.43µs        ? ?/sec    1.01    418.8±2.83µs        ? ?/sec
logical_plan_struct_join_agg_sort                     1.00    160.8±1.66µs        ? ?/sec    1.04    167.2±0.78µs        ? ?/sec
logical_select_all_from_1000                          1.03      8.3±0.03ms        ? ?/sec    1.00      8.0±0.02ms        ? ?/sec
logical_select_one_from_700                           1.01    275.4±1.57µs        ? ?/sec    1.00    273.8±1.53µs        ? ?/sec
logical_trivial_join_high_numbered_columns            1.01    257.8±0.55µs        ? ?/sec    1.00    254.1±2.79µs        ? ?/sec
logical_trivial_join_low_numbered_columns             1.02    244.8±0.58µs        ? ?/sec    1.00    240.3±0.55µs        ? ?/sec
physical_intersection                                 1.00    587.5±2.18µs        ? ?/sec    1.01    592.9±1.63µs        ? ?/sec
physical_join_consider_sort                           1.00   1017.7±3.66µs        ? ?/sec    1.01   1030.2±3.45µs        ? ?/sec
physical_join_distinct                                1.02    239.4±4.44µs        ? ?/sec    1.00    235.5±0.52µs        ? ?/sec
physical_many_self_joins                              1.01      7.7±0.01ms        ? ?/sec    1.00      7.7±0.02ms        ? ?/sec
physical_plan_clickbench_all                          1.00    128.6±0.96ms        ? ?/sec    1.00    128.2±2.24ms        ? ?/sec
physical_plan_clickbench_q1                           1.00   1343.8±5.00µs        ? ?/sec    1.03   1378.6±5.71µs        ? ?/sec
physical_plan_clickbench_q10                          1.01      2.1±0.01ms        ? ?/sec    1.00      2.0±0.00ms        ? ?/sec
physical_plan_clickbench_q11                          1.01      2.2±0.04ms        ? ?/sec    1.00      2.2±0.01ms        ? ?/sec
physical_plan_clickbench_q12                          1.00      2.3±0.01ms        ? ?/sec    1.01      2.3±0.01ms        ? ?/sec
physical_plan_clickbench_q13                          1.00      2.0±0.04ms        ? ?/sec    1.00      2.1±0.01ms        ? ?/sec
physical_plan_clickbench_q14                          1.00      2.2±0.00ms        ? ?/sec    1.01      2.2±0.01ms        ? ?/sec
physical_plan_clickbench_q15                          1.00      2.1±0.00ms        ? ?/sec    1.01      2.1±0.01ms        ? ?/sec
physical_plan_clickbench_q16                          1.00  1729.1±17.91µs        ? ?/sec    1.01   1749.9±5.06µs        ? ?/sec
physical_plan_clickbench_q17                          1.00   1779.9±6.55µs        ? ?/sec    1.01  1796.7±12.88µs        ? ?/sec
physical_plan_clickbench_q18                          1.00   1604.9±9.66µs        ? ?/sec    1.00   1612.3±5.24µs        ? ?/sec
physical_plan_clickbench_q19                          1.00   1984.2±6.15µs        ? ?/sec    1.00   1983.2±5.99µs        ? ?/sec
physical_plan_clickbench_q2                           1.00   1755.5±5.11µs        ? ?/sec    1.03  1800.8±34.04µs        ? ?/sec
physical_plan_clickbench_q20                          1.01  1521.8±17.10µs        ? ?/sec    1.00   1505.9±5.57µs        ? ?/sec
physical_plan_clickbench_q21                          1.01  1767.9±19.29µs        ? ?/sec    1.00   1747.0±4.73µs        ? ?/sec
physical_plan_clickbench_q22                          1.00      2.2±0.03ms        ? ?/sec    1.00      2.2±0.01ms        ? ?/sec
physical_plan_clickbench_q23                          1.02      2.4±0.01ms        ? ?/sec    1.00      2.4±0.01ms        ? ?/sec
physical_plan_clickbench_q24                          1.01      5.8±0.02ms        ? ?/sec    1.00      5.7±0.01ms        ? ?/sec
physical_plan_clickbench_q25                          1.03  1953.3±13.34µs        ? ?/sec    1.00   1899.7±5.70µs        ? ?/sec
physical_plan_clickbench_q26                          1.02   1747.4±9.38µs        ? ?/sec    1.00   1720.8±6.65µs        ? ?/sec
physical_plan_clickbench_q27                          1.02   1966.2±5.84µs        ? ?/sec    1.00   1922.6±8.51µs        ? ?/sec
physical_plan_clickbench_q28                          1.00      2.4±0.01ms        ? ?/sec    1.01      2.4±0.01ms        ? ?/sec
physical_plan_clickbench_q29                          1.00      2.6±0.01ms        ? ?/sec    1.02      2.6±0.01ms        ? ?/sec
physical_plan_clickbench_q3                           1.00   1614.5±7.03µs        ? ?/sec    1.02   1646.0±5.18µs        ? ?/sec
physical_plan_clickbench_q30                          1.00     16.2±0.09ms        ? ?/sec    1.00     16.3±0.03ms        ? ?/sec
physical_plan_clickbench_q31                          1.00      2.4±0.00ms        ? ?/sec    1.02      2.5±0.01ms        ? ?/sec
physical_plan_clickbench_q32                          1.00      2.4±0.02ms        ? ?/sec    1.02      2.5±0.00ms        ? ?/sec
physical_plan_clickbench_q33                          1.00   1982.0±6.42µs        ? ?/sec    1.02      2.0±0.01ms        ? ?/sec
physical_plan_clickbench_q34                          1.00   1728.4±4.79µs        ? ?/sec    1.01   1753.5±4.45µs        ? ?/sec
physical_plan_clickbench_q35                          1.00  1796.6±25.27µs        ? ?/sec    1.02  1834.6±14.12µs        ? ?/sec
physical_plan_clickbench_q36                          1.00      2.1±0.01ms        ? ?/sec    1.01      2.1±0.01ms        ? ?/sec
physical_plan_clickbench_q37                          1.00      2.6±0.03ms        ? ?/sec    1.01      2.6±0.01ms        ? ?/sec
physical_plan_clickbench_q38                          1.00      2.6±0.00ms        ? ?/sec    1.01      2.6±0.01ms        ? ?/sec
physical_plan_clickbench_q39                          1.00      2.7±0.04ms        ? ?/sec    1.00      2.7±0.00ms        ? ?/sec
physical_plan_clickbench_q4                           1.00   1417.0±5.96µs        ? ?/sec    1.02   1442.7±6.32µs        ? ?/sec
physical_plan_clickbench_q40                          1.00      3.5±0.06ms        ? ?/sec    1.00      3.5±0.01ms        ? ?/sec
physical_plan_clickbench_q41                          1.00      3.0±0.01ms        ? ?/sec    1.00      2.9±0.01ms        ? ?/sec
physical_plan_clickbench_q42                          1.00      3.1±0.01ms        ? ?/sec    1.00      3.2±0.01ms        ? ?/sec
physical_plan_clickbench_q43                          1.01      3.3±0.08ms        ? ?/sec    1.00      3.2±0.01ms        ? ?/sec
physical_plan_clickbench_q44                          1.00   1527.5±6.71µs        ? ?/sec    1.00  1529.8±22.03µs        ? ?/sec
physical_plan_clickbench_q45                          1.00   1531.1±4.74µs        ? ?/sec    1.01   1541.5±4.15µs        ? ?/sec
physical_plan_clickbench_q46                          1.01  1873.0±14.69µs        ? ?/sec    1.00   1857.3±4.74µs        ? ?/sec
physical_plan_clickbench_q47                          1.00      2.6±0.00ms        ? ?/sec    1.00      2.6±0.04ms        ? ?/sec
physical_plan_clickbench_q48                          1.01      2.9±0.03ms        ? ?/sec    1.00      2.9±0.01ms        ? ?/sec
physical_plan_clickbench_q49                          1.01      2.9±0.02ms        ? ?/sec    1.00      2.9±0.01ms        ? ?/sec
physical_plan_clickbench_q5                           1.00   1561.8±5.87µs        ? ?/sec    1.01   1582.5±4.51µs        ? ?/sec
physical_plan_clickbench_q50                          1.01      2.9±0.08ms        ? ?/sec    1.00      2.9±0.02ms        ? ?/sec
physical_plan_clickbench_q51                          1.02      2.0±0.01ms        ? ?/sec    1.00  1972.8±12.96µs        ? ?/sec
physical_plan_clickbench_q6                           1.03   1604.0±6.31µs        ? ?/sec    1.00   1550.3±5.09µs        ? ?/sec
physical_plan_clickbench_q7                           1.03   1636.5±6.31µs        ? ?/sec    1.00   1585.1±5.68µs        ? ?/sec
physical_plan_clickbench_q8                           1.04   1950.4±5.04µs        ? ?/sec    1.00   1880.3±6.35µs        ? ?/sec
physical_plan_clickbench_q9                           1.00   1920.7±5.31µs        ? ?/sec    1.00   1915.5±8.98µs        ? ?/sec
physical_plan_struct_join_agg_sort                    1.00   1383.6±2.62µs        ? ?/sec    1.00   1390.1±2.44µs        ? ?/sec
physical_plan_tpcds_all                               1.00    722.1±3.05ms        ? ?/sec    1.02    734.0±3.46ms        ? ?/sec
physical_plan_tpch_all                                1.00     47.5±0.26ms        ? ?/sec    1.02     48.6±0.05ms        ? ?/sec
physical_plan_tpch_q1                                 1.00   1564.1±3.08µs        ? ?/sec    1.02   1592.9±2.44µs        ? ?/sec
physical_plan_tpch_q10                                1.00      3.0±0.01ms        ? ?/sec    1.01      3.0±0.02ms        ? ?/sec
physical_plan_tpch_q11                                1.00      2.2±0.01ms        ? ?/sec    1.01      2.2±0.01ms        ? ?/sec
physical_plan_tpch_q12                                1.00   1312.3±4.03µs        ? ?/sec    1.01   1319.3±4.11µs        ? ?/sec
physical_plan_tpch_q13                                1.00    997.1±5.86µs        ? ?/sec    1.00   1001.2±4.42µs        ? ?/sec
physical_plan_tpch_q14                                1.00   1424.6±3.56µs        ? ?/sec    1.02   1455.1±3.28µs        ? ?/sec
physical_plan_tpch_q16                                1.00   1662.6±3.84µs        ? ?/sec    1.00   1670.4±7.27µs        ? ?/sec
physical_plan_tpch_q17                                1.00   1781.8±5.22µs        ? ?/sec    1.02  1810.8±10.18µs        ? ?/sec
physical_plan_tpch_q18                                1.00   1887.9±5.00µs        ? ?/sec    1.01   1908.0±8.93µs        ? ?/sec
physical_plan_tpch_q19                                1.00      2.5±0.01ms        ? ?/sec    1.03      2.5±0.04ms        ? ?/sec
physical_plan_tpch_q2                                 1.00      4.3±0.00ms        ? ?/sec    1.01      4.4±0.00ms        ? ?/sec
physical_plan_tpch_q20                                1.00      2.4±0.00ms        ? ?/sec    1.01      2.4±0.02ms        ? ?/sec
physical_plan_tpch_q21                                1.00      3.1±0.01ms        ? ?/sec    1.00      3.2±0.00ms        ? ?/sec
physical_plan_tpch_q22                                1.00   1532.3±2.73µs        ? ?/sec    1.01   1545.3±2.12µs        ? ?/sec
physical_plan_tpch_q3                                 1.00      2.0±0.01ms        ? ?/sec    1.02      2.0±0.00ms        ? ?/sec
physical_plan_tpch_q4                                 1.00   1171.6±3.30µs        ? ?/sec    1.01   1184.0±2.00µs        ? ?/sec
physical_plan_tpch_q5                                 1.00      2.6±0.02ms        ? ?/sec    1.02      2.7±0.04ms        ? ?/sec
physical_plan_tpch_q6                                 1.00    658.8±2.10µs        ? ?/sec    1.03    680.3±1.98µs        ? ?/sec
physical_plan_tpch_q7                                 1.00      3.2±0.03ms        ? ?/sec    1.01      3.2±0.02ms        ? ?/sec
physical_plan_tpch_q8                                 1.00      4.2±0.01ms        ? ?/sec    1.01      4.2±0.05ms        ? ?/sec
physical_plan_tpch_q9                                 1.00      2.9±0.00ms        ? ?/sec    1.01      2.9±0.00ms        ? ?/sec
physical_select_aggregates_from_200                   1.00     14.7±0.02ms        ? ?/sec    1.00     14.7±0.04ms        ? ?/sec
physical_select_all_from_1000                         1.02     18.0±0.05ms        ? ?/sec    1.00     17.5±0.04ms        ? ?/sec
physical_select_one_from_700                          1.01    722.8±1.74µs        ? ?/sec    1.00    718.4±2.55µs        ? ?/sec
physical_sorted_union_order_by_10_int64               1.00      4.8±0.01ms        ? ?/sec    1.00      4.8±0.01ms        ? ?/sec
physical_sorted_union_order_by_10_uint64              1.00     11.6±0.01ms        ? ?/sec    1.00     11.6±0.03ms        ? ?/sec
physical_sorted_union_order_by_50_int64               1.00    113.1±0.27ms        ? ?/sec    1.00    113.4±0.42ms        ? ?/sec
physical_sorted_union_order_by_50_uint64              1.00    603.3±2.67ms        ? ?/sec    1.01    607.3±2.79ms        ? ?/sec
physical_theta_join_consider_sort                     1.00   1052.7±3.43µs        ? ?/sec    1.01   1064.5±2.64µs        ? ?/sec
physical_unnest_to_join                               1.00    616.0±1.68µs        ? ?/sec    1.02    627.3±1.70µs        ? ?/sec
physical_window_function_partition_by_12_on_values    1.00    742.5±2.35µs        ? ?/sec    1.02    755.0±2.67µs        ? ?/sec
physical_window_function_partition_by_30_on_values    1.00   1490.4±3.60µs        ? ?/sec    1.01   1499.8±2.64µs        ? ?/sec
physical_window_function_partition_by_4_on_values     1.00    439.8±1.22µs        ? ?/sec    1.05    460.4±1.28µs        ? ?/sec
physical_window_function_partition_by_7_on_values     1.00    551.4±2.09µs        ? ?/sec    1.03    566.7±1.19µs        ? ?/sec
physical_window_function_partition_by_8_on_values     1.00    593.8±1.91µs        ? ?/sec    1.02    606.9±1.39µs        ? ?/sec
with_param_values_many_columns                        1.01    461.3±2.71µs        ? ?/sec    1.00    456.2±2.10µs        ? ?/sec

Resource Usage

base (merge-base)

Metric Value
Wall time 1240.3s
Peak memory 19.7 GiB
Avg memory 19.7 GiB
CPU user 1479.6s
CPU sys 1.6s
Peak spill 0 B

branch

Metric Value
Wall time 1245.3s
Peak memory 19.8 GiB
Avg memory 19.7 GiB
CPU user 1485.2s
CPU sys 1.1s
Peak spill 0 B

Labels

- common: Related to common crate
- core: Core DataFusion crate
- logical-expr: Logical plan and expressions
- proto: Related to proto crate
- sql: SQL Planner
- sqllogictest: SQL Logic Tests (.slt)

Development

Successfully merging this pull request may close these issues.

Recursive CTE Nullability Handling Should Preserve Logical Schema Without Requiring SQL Rewrites