feat: support interval types and make_ym_interval / make_dt_interval#4541

Draft
andygrove wants to merge 1 commit into apache:main from andygrove:interval-type-support

@andygrove
Member

Which issue does this PR close?

Part of #4540.

Rationale for this change

Comet had no support for Spark's interval data types, so every expression producing or consuming an interval forced a fallback to Spark. The JVM codegen dispatcher (CometCodegenDispatch) could not cover them either, because CometBatchKernelCodegen.isSupportedDataType rejected interval output. This PR adds the foundational type support and uses the existing codegen-dispatch mechanism to make the interval constructors run natively while matching Spark exactly.

What changes are included in this PR?

Type support for YearMonthIntervalType and DayTimeIntervalType:

  • native/proto/src/proto/types.proto: add YEAR_MONTH_INTERVAL (18) and DAY_TIME_INTERVAL (19) to DataTypeId.
  • native/core/src/execution/serde.rs: map them to Arrow Interval(YearMonth) and Duration(Microsecond).
  • Utils.toArrowType / fromArrowType: same mapping on the JVM side.
  • QueryPlanSerde.serializeDataType: emit the new type ids.
  • CometBatchKernelCodegen: accept both interval types in isSupportedDataType, resolve IntervalYearVector / DurationVector, and emit primitive set() writes.
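
For reviewers who want the mapping at a glance, the sketch below restates it as a small std-only Rust model. It is not the code in serde.rs: `SparkIntervalType`, `ArrowTypeSketch`, and the three functions are hypothetical names made up for illustration, and the real change uses the `arrow` crate's own data types. Only the ids (18, 19) and the type pairing come from this PR.

```rust
/// Hypothetical mirror of the two new DataTypeId entries (ids from the PR text).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SparkIntervalType {
    YearMonth = 18,
    DayTime = 19,
}

/// Stand-in for the Arrow types named in the PR; NOT the real arrow crate enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ArrowTypeSketch {
    /// Arrow Interval(YearMonth): an i32 count of months.
    IntervalYearMonth,
    /// Arrow Duration(Microsecond): an i64 count of microseconds.
    DurationMicrosecond,
}

/// Decode a proto type id into one of the two interval types, if it is one.
fn from_type_id(id: i32) -> Option<SparkIntervalType> {
    match id {
        18 => Some(SparkIntervalType::YearMonth),
        19 => Some(SparkIntervalType::DayTime),
        _ => None,
    }
}

/// Spark -> Arrow direction of the mapping.
fn to_arrow(t: SparkIntervalType) -> ArrowTypeSketch {
    match t {
        SparkIntervalType::YearMonth => ArrowTypeSketch::IntervalYearMonth,
        SparkIntervalType::DayTime => ArrowTypeSketch::DurationMicrosecond,
    }
}

/// Arrow -> Spark direction; the two functions are inverses of each other.
fn from_arrow(a: ArrowTypeSketch) -> SparkIntervalType {
    match a {
        ArrowTypeSketch::IntervalYearMonth => SparkIntervalType::YearMonth,
        ArrowTypeSketch::DurationMicrosecond => SparkIntervalType::DayTime,
    }
}
```

The point of keeping both directions total and mutually inverse is that a value crossing the FFI boundary in either direction comes back with the same Spark type.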

Representation note: Spark stores DayTimeIntervalType as microseconds in an int64, which round-trips faithfully through Arrow Duration(Microsecond). Arrow Interval(DayTime) uses a {days, millis} layout that would lose microsecond precision, so it is intentionally not used.
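
The precision argument can be checked with plain integer arithmetic. The sketch below is std-only and uses hypothetical helper names (`to_day_millis`, `from_day_millis`); it models the {days, millis} layout as two i32 fields and shows that a microsecond value with a sub-millisecond component does not survive the round trip, whereas an i64 microsecond count needs no conversion at all.

```rust
const MICROS_PER_MILLI: i64 = 1_000;
const MICROS_PER_DAY: i64 = 86_400_000_000;

/// Split a microsecond count into a {days, millis} pair, truncating anything
/// below one millisecond. Euclidean division keeps the millis part non-negative.
fn to_day_millis(micros: i64) -> (i32, i32) {
    let days = micros.div_euclid(MICROS_PER_DAY);
    let rem_micros = micros.rem_euclid(MICROS_PER_DAY);
    (days as i32, (rem_micros / MICROS_PER_MILLI) as i32)
}

/// Rebuild a microsecond count from a {days, millis} pair.
fn from_day_millis(days: i32, millis: i32) -> i64 {
    days as i64 * MICROS_PER_DAY + millis as i64 * MICROS_PER_MILLI
}

/// Round-trip through the {days, millis} layout.
fn round_trip_day_millis(micros: i64) -> i64 {
    let (days, millis) = to_day_millis(micros);
    from_day_millis(days, millis)
}
```

For example, `round_trip_day_millis(1_500_001)` yields `1_500_000`: the trailing microsecond is gone. Values that are whole milliseconds round-trip unchanged, which is why the loss is easy to miss in tests that only use second-granularity inputs.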

Expressions (via CometCodegenDispatch, registered in temporalExpressions):

  • make_ym_interval -> CometMakeYMInterval
  • make_dt_interval -> CometMakeDTInterval
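
As a reminder of what the two constructors compute, here is a rough std-only sketch of the arithmetic. It is an illustration, not the implementation in this PR (which routes the Spark expressions through CometCodegenDispatch). Assumptions: seconds are passed as an already-scaled i64 microsecond count rather than Spark's decimal, the function names simply reuse the SQL names, and overflow is reported as `None` where Spark reports an error.

```rust
const MONTHS_PER_YEAR: i32 = 12;
const MICROS_PER_MINUTE: i64 = 60_000_000;
const MICROS_PER_HOUR: i64 = 60 * MICROS_PER_MINUTE;
const MICROS_PER_DAY: i64 = 24 * MICROS_PER_HOUR;

/// Year-month interval as a total month count (i32), None on overflow.
fn make_ym_interval(years: i32, months: i32) -> Option<i32> {
    years.checked_mul(MONTHS_PER_YEAR)?.checked_add(months)
}

/// Day-time interval as a total microsecond count (i64), None on overflow.
/// `secs_micros` is the seconds argument pre-scaled to microseconds (assumption).
fn make_dt_interval(days: i32, hours: i32, mins: i32, secs_micros: i64) -> Option<i64> {
    let mut micros = secs_micros;
    micros = micros.checked_add((days as i64).checked_mul(MICROS_PER_DAY)?)?;
    micros = micros.checked_add((hours as i64).checked_mul(MICROS_PER_HOUR)?)?;
    micros = micros.checked_add((mins as i64).checked_mul(MICROS_PER_MINUTE)?)?;
    Some(micros)
}
```

Under these assumptions, `make_ym_interval(1, 2)` gives `Some(14)` and `make_dt_interval(1, 2, 3, 4_500_000)` gives `Some(93_784_500_000)`. The results are exactly the i32 and i64 values that the Interval(YearMonth) and Duration(Microsecond) vectors store.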

Out of scope (follow-ups tracked in #4540): CalendarIntervalType, and interval arithmetic / multiply_*_interval / extract of interval fields.

How are these changes tested?

SQL file tests under spark/src/test/resources/sql-tests/expressions/datetime/ (make_ym_interval.sql, make_dt_interval.sql) run by CometSqlFileTestSuite. The query mode uses checkSparkAnswerAndOperator, asserting both answer parity with Spark and native execution (a fallback fails the test). Cases cover column and literal inputs, default arguments, negatives, and nulls. The full SQL suite shows no new regressions.

…[skip ci]

Implements the type-support prerequisite from issue apache#4540: add Spark
YearMonthIntervalType and DayTimeIntervalType as physical types that round-trip
through Comet's Arrow FFI, and route the make_ym_interval / make_dt_interval
constructors through the JVM codegen dispatcher so they execute natively and
match Spark exactly.

Type plumbing:
- proto: add YEAR_MONTH_INTERVAL (18) and DAY_TIME_INTERVAL (19) to DataTypeId.
- native serde.rs: map them to Arrow Interval(YearMonth) and Duration(Microsecond)
  respectively. DayTime stores microseconds in an int64, which matches
  Duration(Microsecond) rather than the lossy Interval(DayTime) {days, millis}.
- Utils.toArrowType / fromArrowType: same mapping on the JVM side.
- QueryPlanSerde.serializeDataType: emit the new type ids.
- CometBatchKernelCodegen: accept the two interval types in isSupportedDataType
  and resolve IntervalYearVector / DurationVector; emit primitive set() writes.

Expressions:
- make_ym_interval -> CometMakeYMInterval, make_dt_interval -> CometMakeDTInterval,
  both via CometCodegenDispatch, registered in temporalExpressions.

CalendarIntervalType and interval arithmetic remain follow-ups under apache#4540.

Tests: SQL file tests for both constructors assert answer parity and native
execution (checkSparkAnswerAndOperator), covering column and literal inputs,
defaults, negatives, and nulls.