Preserve recursive CTE static schema with plan-time schema alignment #22037

Open
kosiew wants to merge 5 commits into apache:main from kosiew:nullability-mismatch-22034

Conversation

@kosiew
Contributor

@kosiew kosiew commented May 6, 2026

Which issue does this PR close?

Rationale for this change

RecursiveQueryExec widened recursive CTE output nullability by reconciling the static and recursive term schemas. This caused the physical schema to diverge from the logical/static CTE schema and forced valid SQL such as 0 AS level to be rewritten as nullable expressions like SUM(0) AS level.

This change preserves the declared recursive CTE schema by treating the static/anchor term schema as authoritative and aligning the recursive term to that schema during plan construction.

What changes are included in this PR?

  • Added align_plan_to_schema, a higher-level plan-time schema alignment helper that guarantees the resulting plan advertises the expected schema exactly.

  • Kept project_plan_to_schema as the narrower projection-based helper and refactored shared validation into validate_schema_alignment.

  • Added SchemaAlignExec, an execution-plan adapter that:

    • advertises the expected schema from plan properties
    • preserves positional column values
    • rebinds emitted RecordBatch schemas inside the adapter
    • validates column count, data types, field metadata, and schema metadata
  • Updated RecursiveQueryExec::try_new to:

    • use the static term schema as the recursive CTE output schema
    • align the recursive term with align_plan_to_schema
    • remove recursive output schema widening logic
  • Restored the recursive CTE SLT coverage from SUM(0) AS level back to 0 AS level.
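
As a rough illustration of the alignment rules listed above, the following is a minimal sketch of how the choice between returning the plan unchanged, adding a projection, and wrapping it in an adapter might be made. Everything in it (`Field`, `DataType`, `Alignment`, `choose_alignment`) is a hypothetical, simplified stand-in and not the PR's actual code or DataFusion's API; metadata validation is omitted, and treating nullability widening as a projection case is an assumption.

```rust
// Hypothetical, simplified model of the decision; not DataFusion's real types.
#[derive(Debug, Clone, PartialEq)]
enum DataType {
    Int32,
    Utf8,
}

#[derive(Debug, Clone, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

// Which strategy the sketch would pick for aligning a plan to a schema.
#[derive(Debug, PartialEq)]
enum Alignment {
    Unchanged,   // schemas already match exactly
    Projection,  // e.g. rename-only differences
    SchemaAlign, // nullable input narrowed to a non-null expected field
}

fn field(name: &str, data_type: DataType, nullable: bool) -> Field {
    Field { name: name.to_string(), data_type, nullable }
}

fn choose_alignment(input: &[Field], expected: &[Field]) -> Result<Alignment, String> {
    // Column counts must agree before any positional comparison.
    if input.len() != expected.len() {
        return Err(format!(
            "column count mismatch: input has {}, expected {}",
            input.len(),
            expected.len()
        ));
    }
    // Data types must agree column by column; no casting is attempted.
    for (idx, (inp, exp)) in input.iter().zip(expected).enumerate() {
        if inp.data_type != exp.data_type {
            return Err(format!(
                "type mismatch at column {}: {:?} vs {:?}",
                idx, inp.data_type, exp.data_type
            ));
        }
    }
    if input == expected {
        return Ok(Alignment::Unchanged);
    }
    // A nullable input field under a non-null expected field needs the adapter.
    let narrows = input
        .iter()
        .zip(expected)
        .any(|(inp, exp)| inp.nullable && !exp.nullable);
    if narrows {
        Ok(Alignment::SchemaAlign)
    } else {
        Ok(Alignment::Projection)
    }
}
```

In this sketch, a recursive term that yields a nullable `level` column against a non-null static `level` would select the adapter path, while a recursive term that only differs in column name would select the projection path.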

Are these changes tested?

Yes.

Added and updated tests covering:

  • align_plan_to_schema:

    • exact schema returns unchanged plan
    • rename-only alignment uses ProjectionExec
    • nullable input to non-null expected schema uses SchemaAlignExec
    • column count mismatch errors
    • type mismatch errors
    • field metadata mismatch errors
    • schema metadata mismatch errors
  • project_plan_to_schema:

    • schema match passthrough
    • nullability widening
    • nullability narrowing rejection
    • metadata mismatch validation
  • RecursiveQueryExec:

    • recursive term projection alignment
    • preservation of the static nullability contract
    • recursive term schema matches the static schema after construction
  • Restored SQL logic test coverage in cte.slt using 0 AS level.

Validated with:

cargo test -p datafusion-physical-plan recursive_query_exec
cargo test -p datafusion-physical-plan project_plan_to_schema
cargo test -p datafusion-sqllogictest --test sqllogictests -- cte

Are there any user-facing changes?

Yes.

Recursive CTEs now preserve the declared/static schema instead of widening nullability based on recursive expressions. Existing valid SQL such as:

0 AS level

continues to work without requiring nullable rewrites like:

SUM(0) AS level

LLM-generated code disclosure

This PR includes LLM-generated code and comments. All LLM-generated content has been manually reviewed and tested.

kosiew added 5 commits May 6, 2026 11:49
- Added `align_plan_to_schema` and `SchemaAlignExec` for improved schema alignment in execution plans.
- Maintained strict behavior in `project_plan_to_schema` for projection-only cases.
- Updated adapter to handle nullability narrowing while preserving SQL behavior.
- Modified `RecursiveQueryExec` to preserve static/declared schema and aligned recursive term at plan construction.
- Removed nullability-widening schema synthesis for cleaner execution.
- Restored `0 AS level` in SQL logic test file `cte.slt`.
…ent behavior

- Added direct tests for align_plan_to_schema:
  - Verified exact schema returns the same plan.
  - Ensured rename-only uses ProjectionExec.
  - Confirmed nullability narrowing uses SchemaAlignExec.
  - Tested count/type/field metadata/schema metadata errors.
- Documented conservative property behavior in the adapter path.
- Refactored `align_plan_to_schema` function to store input schema in a variable, reducing redundant calls.
- Updated validation and comparison logic for better clarity and performance.
- Simplified partitioning handling in `SchemaAlignExec` by consolidating pattern matching.
- Enhanced `DisplayAs` implementation to correctly handle `TreeRender` format.
…odules

- Reuse `input_schema` in common.rs
- Simplify projected return using `debug_assert_eq!`
- Utilize `partition_count()` in common.rs
- Modify TreeRender to return `Ok(())`
- Reuse `static_schema` in tests for recursive_query.rs
- Removed redundant upfront align validation in common.rs.
- Added test helpers in common.rs:
  - single_field_schema
  - single_i32_exec
  - metadata mismatch builders
- Shortened repeated test setup in common.rs.
- Added recursive_exec test helper in recursive_query.rs.
- Simplified RecursiveQueryExec::try_new(...) in recursive_query.rs.
@github-actions github-actions Bot added the sqllogictest (SQL Logic Tests (.slt)) and physical-plan (Changes to the physical-plan crate) labels May 6, 2026
@kosiew kosiew marked this pull request as ready for review May 6, 2026 06:17
@neilconway
Contributor

neilconway commented May 9, 2026

Please let me know if I'm understanding this correctly:

  • The PR aims to address a situation where there is a schema mismatch between the anchor and recursive cases in a CTE
  • In particular, we might infer different nullability for the anchor and the recursive query. For example, with 0 in the anchor and min(...) in the recursive case, 0 is non-nullable and min(...) is nullable. (As an aside, the latter is conservative: min(x) without FILTER in a grouped query is non-nullable if x is non-nullable, but I suppose this is a separate planner shortcoming.)
  • The proposed behavior is to apply the anchor schema to the recursive CTE schema. So we would effectively be requiring that a nullable min expression never returns a NULL, in the example above
  • If the recursive query does return a NULL, we produce a runtime error

If that is accurate, then the proposed behavior would result in this query producing an error:

SET datafusion.execution.enable_recursive_ctes = true;

WITH RECURSIVE t AS (
  SELECT 0 AS n
  UNION ALL
  SELECT CAST(NULL AS INT) AS n FROM t WHERE n IS NOT NULL
)
SELECT * FROM t;

(Column 'n' is declared as non-nullable but contains null values) -- but this query seems entirely reasonable to me and is allowed by other SQL implementations (e.g., Postgres, DuckDB, MariaDB, SQLite).
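
To make the failure mode concrete, here is a minimal sketch of the kind of per-batch check that would reject this query when the anchor's non-null schema is applied to the recursive term. The function `check_column` and its use of `Option<i32>` values are hypothetical stand-ins, not DataFusion's actual validation code; only the error text is taken from the message quoted above.

```rust
// Hypothetical stand-in for a batch-level nullability check.
// `nullable` is the flag advertised by the schema; `values` models one column.
fn check_column(name: &str, nullable: bool, values: &[Option<i32>]) -> Result<(), String> {
    // A NULL is only an error when the schema promised there would be none.
    if !nullable && values.iter().any(|v| v.is_none()) {
        return Err(format!(
            "Column '{}' is declared as non-nullable but contains null values",
            name
        ));
    }
    Ok(())
}
```

Under this model the anchor batch (`0`) passes, while the first recursive batch (a single NULL) fails as long as `n` keeps the anchor's non-null flag.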

Instead, shouldn't we be computing the CTE's logical schema by widening the anchor and the recursive schemas? This is conceptually similar to what we do for UNION. That is, if the anchor expression is non-nullable and the recursive expression is nullable, the output schema should be nullable.
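
The widening rule suggested here can be sketched as follows: an output column is nullable if either the anchor column or the recursive column is nullable. The `Column` type and `widen_nullability` function are hypothetical and simplified (types and metadata are ignored); taking output names from the anchor mirrors how UNION takes names from its first branch.

```rust
// Hypothetical, simplified column description: name and nullability only.
#[derive(Debug, Clone, PartialEq)]
struct Column {
    name: String,
    nullable: bool,
}

// Combine anchor and recursive schemas positionally.
// Returns None when the column counts differ.
fn widen_nullability(anchor: &[Column], recursive: &[Column]) -> Option<Vec<Column>> {
    if anchor.len() != recursive.len() {
        return None;
    }
    Some(
        anchor
            .iter()
            .zip(recursive)
            .map(|(a, r)| Column {
                // Output names follow the anchor term.
                name: a.name.clone(),
                // Nullable if either side may produce NULL.
                nullable: a.nullable || r.nullable,
            })
            .collect(),
    )
}
```

With a non-null anchor `n` and a nullable recursive `n`, the sketch yields a nullable `n`, so the example query would not trip a non-null check.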

Labels

physical-plan: Changes to the physical-plan crate
sqllogictest: SQL Logic Tests (.slt)

Development

Successfully merging this pull request may close these issues.

Preserve recursive CTE declared schema when aligning physical children

2 participants