fix: preserve precision in log() with mixed float-width arguments #23310

Draft
raphaelroshan wants to merge 1 commit into apache:main from raphaelroshan:fix-22581-log-float32-precision

@raphaelroshan


Which issue does this PR close?

  • Closes #22581.

Rationale for this change

log(base, value) currently dispatches on the type of the value (the last argument): return_type inspects only the value's type, and invoke_with_args computes the result in that type. When the base is a wider float than the value — e.g. log(Float64, Float32) — the base is narrowed down to the value's float type before the computation, which loses precision. This is the behavior reported in #22581.

The fix picks the widest float among all arguments as the result type, and widens a Float16/Float32 value up to that type before computing, so the base no longer has to be narrowed.
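
The precision loss described above can be reproduced outside DataFusion with plain Rust floats. The sketch below is illustrative only: `log_narrow_base` and `log_widen_value` are hypothetical names, not functions from this PR, and they model the old and proposed behaviour for a single `log(Float64, Float32)` call rather than the array kernels.

```rust
/// Models the old behaviour as described in the rationale: the f64 base is
/// narrowed to the value's f32 type and the logarithm is computed in f32.
/// (Hypothetical helper, not part of the PR.)
fn log_narrow_base(base: f64, value: f32) -> f32 {
    // `as f32` rounds to the nearest f32, discarding the low bits of the base.
    let narrowed = base as f32;
    value.log(narrowed)
}

/// Models the proposed behaviour: the f32 value is widened to f64 and the
/// logarithm is computed in f64, so the base keeps all of its bits.
/// (Hypothetical helper, not part of the PR.)
fn log_widen_value(base: f64, value: f32) -> f64 {
    // f32 -> f64 conversion is exact, so no information is lost here.
    f64::from(value).log(base)
}
```

For example, with `base = 2.000000001_f64` and `value = 8.0_f32`, the narrowed base rounds to exactly `2.0_f32`, so the f32 path cannot distinguish it from a base of 2, while the f64 path returns a result slightly below 3.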

AI-assisted disclosure: I used an AI assistant while working on this. I understand and can justify the change end-to-end. One thing I'd flag for reviewers: I treat any non-float argument (e.g. an integer base, or decimals which are already computed in f64) as Float64 for ranking purposes via a float_rank helper — this matches the existing "decimals compute in f64" path, but please sanity-check that assumption against any type-coercion expectations I might be missing.

What changes are included in this PR?

  • return_type now returns the widest float across all arguments instead of only looking at the value's type (via a small float_rank helper).
  • In invoke_with_args, when the base is wider than a Float16/Float32 value, the value is cast up to the result type before computation so the base is not narrowed. Decimal values are left untouched (still computed in f64).
  • Added a unit test (test_log_f64_base_f32_value) and sqllogictest cases in math.slt covering log(Float64, Float32), log(Float64, Float16), and integer-base cases.
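
The type-selection rule in the list above can be sketched in isolation. This is a minimal illustration under stated assumptions, not the PR's code: `ArgType` is a hypothetical stand-in for the relevant Arrow data types, `log_return_type` is a hypothetical name, the body of `float_rank` is a guess at what such a helper could look like, and the fallback to Float64 for an empty argument list is an assumption of this sketch.

```rust
/// Hypothetical stand-in for the handful of Arrow data types relevant here.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ArgType {
    Float16,
    Float32,
    Float64,
    Int64,
    Decimal128,
}

/// Ranks floats by width. Non-float arguments (integers, decimals) rank as
/// Float64, mirroring the "decimals compute in f64" assumption in the PR text.
fn float_rank(t: ArgType) -> u8 {
    match t {
        ArgType::Float16 => 0,
        ArgType::Float32 => 1,
        _ => 2,
    }
}

/// Picks the widest float among all arguments as the result type.
fn log_return_type(args: &[ArgType]) -> ArgType {
    // Assumption of this sketch: an empty argument list falls back to Float64.
    let widest = args.iter().map(|&t| float_rank(t)).max().unwrap_or(2);
    match widest {
        0 => ArgType::Float16,
        1 => ArgType::Float32,
        _ => ArgType::Float64,
    }
}
```

Under this rule, `log_return_type(&[ArgType::Float64, ArgType::Float32])` yields `Float64`, while the unary and same-width cases keep the value's own type.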

Are these changes tested?

Yes.

  • cargo test -p datafusion-functions --lib math:: — green (includes the new test_log_f64_base_f32_value).
  • math.slt sqllogictests — green (includes new mixed-width cases and updated expected types for previously-narrowing cases).

Are there any user-facing changes?

Yes — this is a user-facing behavior change to log() return types in mixed-float-width cases, and I'd like maintainers to confirm the desired semantics.

Previously the return type of log(base, value) followed the value's type. Now it follows the widest float among all arguments. Concretely:

  • log(Float64, Float32) now returns Float64 (was Float32)
  • log(Float64, Float16) now returns Float64 (was Float16)
  • log(2, arrow_cast(2.0, 'Float32')) (integer base) now returns Float64 (was Float32)

The unary log(value) and same-width cases (log(Float64, Float64), log(Float32, Float32), etc.) are unchanged. Because output precision/width changes in the mixed cases, some existing math.slt expected values were updated accordingly.

Opening this as a draft proposal because widening the return type is a semantics decision: it fixes the precision loss in #22581, but it changes observed output types/precision for existing mixed-width queries. Does widening to the widest float match the intended log semantics, or would you prefer preserving the value's type (and accepting the precision loss, or handling it differently)? Happy to adjust. If this is considered a breaking change to the public output type, I can add the api change label.

Previously log(base, value) computed and returned its result in the
value's float type. When the base was a wider float (e.g.
log(Float64, Float32)), the base was narrowed to the value's type,
losing precision.

The result type is now the widest float among the arguments, and a
Float16/Float32 value is widened to that type before computation so the
base is no longer narrowed.

Closes apache#22581
github-actions bot added the sqllogictest (SQL Logic Tests (.slt)) and functions (Changes to functions implementation) labels on Jul 3, 2026

Development

Successfully merging this pull request may close these issues.

log(base, val) loses precision when value is Float32
