feat(extensions): support deprecation info in extensions#846
feat(extensions): support deprecation info in extensions#846nielspardon wants to merge 3 commits into
Conversation
Substrait v0.86.0 added the ability to mark simple-extension entries as deprecated (substrait-io/substrait#1014). An optional `deprecated` object (`since`, optional `reason`, optional `metadata`) can appear on types, scalar/aggregate/window functions, and individual function implementations. Parse and expose this information so tooling can surface deprecation warnings, mirroring the existing `metadata` support. Function-level deprecation propagates to each resolved variant; an impl-level deprecation takes precedence over the parent function's when both are present. Type variations are not modeled in substrait-java's extension schema, so type-variation deprecation is out of scope. Closes substrait-io#811 Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
vbarua
left a comment
There was a problem hiding this comment.
Left some minor comments but overall seems reasonable.
|
|
||
| /** | ||
| * Deprecation information for an extension entry (type, function, or function implementation). | ||
| * See <a href="https://github.com/substrait-io/substrait/pull/1014">substrait#1014</a>. |
There was a problem hiding this comment.
I don't think it's worth linking to the PR where this was introduced. The Javadoc should stand on it's own, and it already does.
There was a problem hiding this comment.
I removed that PR link from the comments. It suspect the AI wanted to follow an existing practice since this file contains another PR reference:
substrait-java/core/src/main/java/io/substrait/extension/SimpleExtension.java
Lines 180 to 182 in 594a168
| String name, | ||
| String description, | ||
| Optional<Map<String, Object>> metadata, | ||
| Optional<DeprecationStatus> deprecated) { |
There was a problem hiding this comment.
As this is a public API, we should note the breaking change in the description.
There was a problem hiding this comment.
I'm not even sure why this specific method is public while AggregateFunctionVariant.resolve() and WindowFunctionVariant are package private.
There was a problem hiding this comment.
I'm tempted to change the visibility of that method to be in line with the others methods and mark the change of visibility as a breaking change. What do you think?
There was a problem hiding this comment.
Now that you point it out, it is weird that this is public. Hiding this makes sense to me.
| .args(args()) | ||
| .options(options()) | ||
| .metadata(metadata) | ||
| .deprecated(deprecated().isPresent() ? deprecated() : deprecated) |
There was a problem hiding this comment.
If I understand what we're doing here, resolve is ScalarFunction#resolve which gets called with the function-level deprecation status. When we create the ScalarFunctionVariant, we first check if it has an impl level deprecation status, and if not just use the function-level one.
This is definitely a little lossy in terms of representation, but works reasonable well. Is the reason to inject the deprecation status like this to make it easy to check if ScalarFunctionVariant is deprecated directly without having to access the ScalarFunction holding it? If so, could we do something like inject the function-level deprecation status and store both on the ScalarFunctionImp? That way we still check the deprecation status easily with a method like isDeprecated that looks at the function-level and impl-level deprecation, but when we serialize ScalarFunctionImpl we can ignore the function-level deprecation to avoid setting it for every impl.
There was a problem hiding this comment.
This is definitely a little lossy in terms of representation, but works reasonable well. Is the reason to inject the deprecation status like this to make it easy to check if ScalarFunctionVariant is deprecated directly without having to access the ScalarFunction holding it?
I think the AI wanted to follow the existing pattern where we already copy the function name, description and metadata over from the function (ScalarFunction) to the impl (ScalarFunctionVariant). We could store a pointer to the ScalarFunction inside the ScalarFunctionVariant in general for accessing the function level metadata from the impl more easily.
There was a problem hiding this comment.
The current behavior is already lossy since it overwrites the impl description with the function description.
There was a problem hiding this comment.
I also spent a bit more time trying to understand the existing code. What seems to happen is that we are deserializing the extensions YAMLs into SimpleExtension.ExtensionCollection which then contains most of the structure in the YAML and then the resolve() method copies some of the function level details to the impl level FunctionVariant classes for easier consumption like name, description, metadata, urn, etc.
It feels to me like the resolve step is a bit redundant and we could add pointers to the parent fields instead.
There was a problem hiding this comment.
It feels to me like the resolve step is a bit redundant and we could add pointers to the parent fields instead.
That seems like a better design IMO. I vaguely recalled that resolve was a mechanism to inject context from higher up in the YAML into child objects. Turns out I made some modifications around this like 3 years ago #149
If we just had pointers to the parents, we could avoid it entirely.
There was a problem hiding this comment.
ok, I have a refactoring locally already. can push that up. maybe also include #848 while we refactor this area.
There was a problem hiding this comment.
The current behavior is already lossy since it overwrites the impl description with the function description.
Given that it's already lossy, I would be happy to see this merged and then have the refactor as a follow-up. No need to block this change, and it will make the refactor PR smaller.
Signed-off-by: Niels Pardon <par@zurich.ibm.com>
Signed-off-by: Niels Pardon <par@zurich.ibm.com>
vbarua
left a comment
There was a problem hiding this comment.
Circling back, if we hide the public resolve method I would be happy to see this shipped as is.
| String name, | ||
| String description, | ||
| Optional<Map<String, Object>> metadata, | ||
| Optional<DeprecationStatus> deprecated) { |
There was a problem hiding this comment.
Now that you point it out, it is weird that this is public. Hiding this makes sense to me.
| .args(args()) | ||
| .options(options()) | ||
| .metadata(metadata) | ||
| .deprecated(deprecated().isPresent() ? deprecated() : deprecated) |
There was a problem hiding this comment.
The current behavior is already lossy since it overwrites the impl description with the function description.
Given that it's already lossy, I would be happy to see this merged and then have the refactor as a follow-up. No need to block this change, and it will make the refactor PR smaller.
BREAKING CHANGE: changes the public API of the
ScalarFunctionVariant.resolve()function by adding an additional parameterSummary
Implements support for deprecation information in simple extensions, introduced in Substrait v0.86.0 (substrait-io/substrait#1014).
An optional
deprecatedobject can now be parsed and exposed on:The
deprecatedobject carries a requiredsince(core semantic version string), an optionalreason, and optionalmetadata. As with the existingmetadatasupport, consumers are not required to validate these fields — the information is surfaced so tooling can produce deprecation warnings.Type variations are not modeled in substrait-java's extension schema (
ExtensionSignaturesonly coverstypes,scalar_functions,aggregate_functions,window_functions), so type-variation deprecation is out of scope. I opened #847 to track adding type-variation support to the extension model (which would then also cover their deprecation info).Details
SimpleExtension.DeprecationStatusimmutable value type (since,reason,metadata), following the existingOption/VariadicBehaviorpattern.Optional<DeprecationStatus> deprecated()toType, theScalarFunction/AggregateFunction/WindowFunctionparent classes, and theFunctionbase class (per impl).Testing
DeprecationExtensionTest(7 tests) covering type, function-level, and impl-level deprecation, deprecationmetadata, an overload-specific deprecation where a sibling overload is not deprecated, and carry-over to aggregate-as-window functions.deprecation_extensions.yamltest resource.Closes #811
🤖 Generated with Claude Code