[FLINK-39821][table] Make REGEXP_REPLACE return type nullable #28293
Merged
REGEXP_REPLACE used nullableIfArgs, so the planner inferred a NOT NULL output when all three arguments were NOT NULL. The runtime returns null for a non-literal regex that fails to compile, since a column reference or CONCAT result is only validated at runtime, not at planning time. That let a null value flow through a column the planner believed was non-null. Switches the output type strategy to explicit(STRING().nullable()), matching REGEXP_EXTRACT which is nullable for the same reason. REGEXP keeps nullableIfArgs because its runtime returns false, not null, on an invalid pattern. Adds a RegexpFunctionsITCase case with NOT NULL arguments and a non-literal invalid regex, asserting the output type stays nullable and the value is null.
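The difference between the two output type strategies can be sketched as a small nullability model. This is an illustrative sketch only, not Flink's planner code: the class `NullabilitySketch` and both method names are hypothetical, and each argument is reduced to a boolean meaning "this argument's type is nullable".

```java
import java.util.List;

public class NullabilitySketch {

    /**
     * Hypothetical model of a nullableIfArgs-style strategy:
     * the output is nullable only if at least one argument is nullable.
     */
    static boolean nullableIfArgs(List<Boolean> argIsNullable) {
        return argIsNullable.contains(Boolean.TRUE);
    }

    /**
     * Hypothetical model of an explicit nullable output type:
     * the output is nullable regardless of the argument types.
     */
    static boolean explicitNullable(List<Boolean> argIsNullable) {
        return true;
    }

    public static void main(String[] args) {
        // All three REGEXP_REPLACE arguments declared NOT NULL.
        List<Boolean> allNotNull = List.of(false, false, false);

        // Old behavior: the planner would infer a NOT NULL result.
        System.out.println("nullableIfArgs nullable?   " + nullableIfArgs(allNotNull));

        // New behavior: the result stays nullable.
        System.out.println("explicitNullable nullable? " + explicitNullable(allNotNull));
    }
}
```

With all-`NOT NULL` arguments, the first model reports a non-nullable output, which is the case where a runtime `null` would contradict the inferred type.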
Collaborator
dylanhz
approved these changes
Jun 3, 2026
dylanhz
left a comment
Contributor
Thanks for the quick fix, LGTM!
fhueske
approved these changes
Jun 3, 2026
fhueske
left a comment
Contributor
Thanks for the fix!
I'll merge this 🙌
Issue was mentioned post-merge in this PR: #28189 (comment)
What is the purpose of the change
REGEXP_REPLACE declared its output via nullableIfArgs, so the planner inferred a NOT NULL result when all three arguments were NOT NULL. But the runtime returns null for a non-literal regex that fails to compile: a column reference or a CONCAT(...) result is only validated at runtime, not at planning time. That let a null flow through a column the planner believed was non-null, which is unsound.
This is a sub-task of the FLINK-39648 umbrella.
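The runtime behavior described above can be illustrated with a self-contained sketch. It is an assumption-laden stand-in, not Flink's actual implementation: the class `RegexpReplaceSketch` and its `regexpReplace` method are hypothetical, and the only behavior taken from the description is that a pattern which fails to compile yields `null` instead of an exception.

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class RegexpReplaceSketch {

    /**
     * Hypothetical stand-in for the runtime function.
     * Returns null if any argument is null or if the regex does not compile.
     */
    static String regexpReplace(String input, String regex, String replacement) {
        if (input == null || regex == null || replacement == null) {
            return null;
        }
        try {
            return Pattern.compile(regex).matcher(input).replaceAll(replacement);
        } catch (PatternSyntaxException e) {
            // A non-literal pattern (e.g. from a column) can only be checked here,
            // at runtime, so an invalid one produces null rather than a failure.
            return null;
        }
    }

    public static void main(String[] args) {
        // Valid pattern: every run of 'o' is replaced.
        System.out.println(regexpReplace("foobar", "o+", "0")); // prints "f0bar"

        // Invalid pattern (unclosed group): all inputs are non-null, result is null.
        System.out.println(regexpReplace("foobar", "(", "x")); // prints "null"
    }
}
```

The second call is the problematic case: every input is non-null, yet the result is `null`, so an output type derived only from argument nullability would be wrong.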
Brief change log
- Switch the BuiltInFunctionDefinitions.REGEXP_REPLACE output type strategy from nullableIfArgs(explicit(STRING())) to explicit(DataTypes.STRING().nullable()), matching REGEXP_EXTRACT, which is always nullable for the same reason. (REGEXP keeps nullableIfArgs because it returns false, not null, on an invalid pattern.)
Verifying this change
This change added a test and can be verified as follows:
- Added a RegexpFunctionsITCase case with NOT NULL arguments and a non-literal invalid regex, asserting the output type stays nullable and the value is null.
- RegexpFunctionsITCase, ScalarFunctionsTest, SqlExpressionTest, MultiJoinTest (plan digests unchanged), and ExpressionSerializationTest serve as regression tests for the function.
Does this pull request potentially affect one of the following parts:
@Public(Evolving): no
Documentation
Was generative AI tooling used to co-author this PR?
Generated-by: Opus 4.8