Skip to content

Conversation

@kumarUjjawal
Copy link
Contributor

Which issue does this PR close?

Rationale for this change

When calculating the median of an even-length array of integers, averaging the two middle values using add_wrapping causes incorrect results due to integer overflow.

For example, with Int8 values -85 and -56:

Expected: (-85 + -56) / 2 = -70
Actual: -85 + -56 = -141 wraps to 115, then 115 / 2 = 57

What changes are included in this PR?

  • Fix calculate_median : Use add_checked to detect overflow, and fall back to a safe midpoint formula a/2 + b/2 + ((a%2 + b%2) / 2) when overflow occurs.

  • Add tests

Are these changes tested?

  • All previous tests pass
  • Added new tests

Are there any user-facing changes?

@github-actions github-actions bot added sqllogictest SQL Logic Tests (.slt) functions Changes to functions implementation labels Dec 27, 2025
@github-actions github-actions bot added the core Core DataFusion crate label Dec 27, 2025
Copy link
Contributor

@Jefffrey Jefffrey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, these tests look great.

One suggestion, maybe add a test where we max out the values for the type (e.g. i8::MAX) on both operands to see how the code would handle those cases?

Copy link
Contributor

@petern48 petern48 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! I know I wrote it in the issue title, but how about we remove the integer truncation part from the PR title, since this only fixes the overflow part?

@kumarUjjawal kumarUjjawal changed the title fix: Median() encountered integer overflow and truncates integer results fix: Median() encountered integer overflow Dec 27, 2025
@kumarUjjawal kumarUjjawal changed the title fix: Median() encountered integer overflow fix: Median() integer overflow Dec 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate functions Changes to functions implementation sqllogictest SQL Logic Tests (.slt)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

bug: Median() encountered integer overflow and truncates integer results

3 participants