Skip to content

fix: Fix bug in array_has scalar path with sliced arrays#20677

Open
neilconway wants to merge 3 commits intoapache:mainfrom
neilconway:neilc/array-has-slice-bug
Open

fix: Fix bug in array_has scalar path with sliced arrays#20677
neilconway wants to merge 3 commits intoapache:mainfrom
neilconway:neilc/array-has-slice-bug

Conversation

@neilconway
Copy link
Contributor

@neilconway neilconway commented Mar 3, 2026

Which issue does this PR close?

N/A

Rationale for this change

In #20374, array_has with a scalar needle was optimized to reconstruct matches more efficiently. Unfortunately, that code was incorrect for sliced arrays: values() returns the entire value buffer (including elements outside the visible slice), so we need to skip the corresponding indexes in the result bitmap.

We could fix this by just skipping indexes, but it seems more robust and efficient to arrange to not compare the needle against elements outside the visible range in the first place.

array_position has a similar behavior (introduced in #20532): it didn't have the buggy behavior, but it still did extra work for sliced arrays by comparing against elements outside the visible range.

Benchmarking the revised code, there is no performance regression for unsliced arrays.

What changes are included in this PR?

  • Fix array_has bug for sliced arrays with scalar needle
  • Improve array_has and array_position to not compare against elements outside the visible range of a sliced array
  • Add unit test for array_has bug
  • Add unit test to increase confidence in array_position behavior for sliced arrays

Are these changes tested?

Yes.

Are there any user-facing changes?

No.

@github-actions github-actions bot added the functions Changes to functions implementation label Mar 3, 2026
@neilconway neilconway changed the title fix: Fix bug in array_has scalar path with sliced arrays fix: Fix bug in array_has scalar path with sliced arrays Mar 3, 2026
Copy link
Contributor

@comphead comphead left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @neilconway, good job as usual!

}

#[test]
fn test_array_has_sliced_list() -> Result<(), DataFusionError> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

double checked it is failing in main, this fix would be nice backporting to #19692

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. I haven't done a backport before, but I'd guess we should wait for this to land in main, then I'll open a new PR to backport to the 53 branch?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

functions Changes to functions implementation

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants