@viirya viirya commented Dec 28, 2025

Which issue does this PR close?

  • Closes #.

Rationale for this change

What changes are included in this PR?

Use a reusable String buffer instead of allocating a new String for each row. By avoiding per-row allocations, this reduces benchmark time by 17-32% across the string types and sizes tested.
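
As a rough illustration of the buffer-reuse idea, here is a minimal sketch in plain Rust. It is not the code from this PR: `FlatStrings`, `replace_into_buffer`, and `replace_all_rows` are hypothetical names, and `FlatStrings` is only a simplified stand-in for an Arrow string builder. The point it shows is that one `String` is allocated up front, cleared for each row (which keeps its capacity), and its contents are copied into contiguous output storage.

```rust
/// Hypothetical stand-in for an Arrow string builder: all row values live in
/// one contiguous `String`, and `offsets[i]..offsets[i + 1]` delimits row `i`.
struct FlatStrings {
    values: String,
    offsets: Vec<usize>,
}

impl FlatStrings {
    fn new() -> Self {
        Self { values: String::new(), offsets: vec![0] }
    }

    /// Copies `value` into the shared storage and records where it ends.
    fn push(&mut self, value: &str) {
        self.values.push_str(value);
        self.offsets.push(self.values.len());
    }

    fn get(&self, row: usize) -> &str {
        &self.values[self.offsets[row]..self.offsets[row + 1]]
    }
}

/// Hypothetical helper: writes `input` with every `from` replaced by `to`
/// into `buffer`, reusing whatever capacity the buffer already has.
fn replace_into_buffer(buffer: &mut String, input: &str, from: &str, to: &str) {
    buffer.clear(); // drops the contents but keeps the allocation
    let mut last_end = 0;
    for (start, matched) in input.match_indices(from) {
        buffer.push_str(&input[last_end..start]); // text before the match
        buffer.push_str(to); // the replacement
        last_end = start + matched.len();
    }
    buffer.push_str(&input[last_end..]); // text after the last match
}

/// Hypothetical driver standing in for the per-row loop over an array.
fn replace_all_rows(rows: &[&str], from: &str, to: &str) -> FlatStrings {
    let mut buffer = String::new(); // allocated once, reused for every row
    let mut out = FlatStrings::new();
    for row in rows {
        replace_into_buffer(&mut buffer, row, from, to);
        out.push(&buffer);
    }
    out
}
```

With this shape, the scratch buffer only reallocates when a row needs more capacity than any earlier row did, whereas calling `str::replace` per row returns a freshly allocated `String` each time.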

Benchmark results:

| Benchmark     | Array Size | String Length | Baseline (µs) | Optimized (µs) | Improvement  |
|---------------|------------|---------------|---------------|----------------|--------------|
| string_view   | 1024       | 32            | 32.53         | 22.32          | 31.4% faster |
| string        | 1024       | 32            | 31.89         | 21.49          | 32.6% faster |
| large_string  | 1024       | 32            | 31.75         | 22.01          | 30.7% faster |
| string_view   | 1024       | 128           | 49.51         | 36.11          | 27.1% faster |
| string        | 1024       | 128           | 48.91         | 34.90          | 28.6% faster |
| large_string  | 1024       | 128           | 49.78         | 35.42          | 28.8% faster |
| string_view   | 4096       | 32            | 133.67        | 95.93          | 28.2% faster |
| string        | 4096       | 32            | 131.48        | 91.73          | 30.2% faster |
| large_string  | 4096       | 32            | 129.61        | 92.82          | 28.4% faster |
| string_view   | 4096       | 128           | 191.50        | 153.74         | 19.7% faster |
| string        | 4096       | 128           | 185.27        | 149.37         | 19.4% faster |
| large_string  | 4096       | 128           | 187.82        | 154.32         | 17.8% faster |

Are these changes tested?

Are there any user-facing changes?

@github-actions github-actions bot added the functions Changes to functions implementation label Dec 28, 2025
@viirya viirya force-pushed the replace_optimize_v2 branch 2 times, most recently from 3e29ea3 to 3f8c9bb Compare December 28, 2025 18:54
@viirya viirya marked this pull request as draft December 28, 2025 19:21
Use a reusable String buffer instead of allocating a new String for each
row. By avoiding per-row allocations, this reduces benchmark time by 17-32%
across the string types and sizes tested.

Benchmark results:
| Benchmark     | Array Size | String Length | Baseline (µs) | Optimized (µs) | Improvement |
|---------------|------------|---------------|---------------|----------------|-------------|
| string_view   | 1024       | 32            | 32.53         | 22.32          | 31.4% faster|
| string        | 1024       | 32            | 31.89         | 21.49          | 32.6% faster|
| large_string  | 1024       | 32            | 31.75         | 22.01          | 30.7% faster|
| string_view   | 1024       | 128           | 49.51         | 36.11          | 27.1% faster|
| string        | 1024       | 128           | 48.91         | 34.90          | 28.6% faster|
| large_string  | 1024       | 128           | 49.78         | 35.42          | 28.8% faster|
| string_view   | 4096       | 32            | 133.67        | 95.93          | 28.2% faster|
| string        | 4096       | 32            | 131.48        | 91.73          | 30.2% faster|
| large_string  | 4096       | 32            | 129.61        | 92.82          | 28.4% faster|
| string_view   | 4096       | 128           | 191.50        | 153.74         | 19.7% faster|
| string        | 4096       | 128           | 185.27        | 149.37         | 19.4% faster|
| large_string  | 4096       | 128           | 187.82        | 154.32         | 17.8% faster|
@viirya viirya force-pushed the replace_optimize_v2 branch from 3f8c9bb to c10d410 Compare December 28, 2025 19:36
@viirya viirya marked this pull request as ready for review December 28, 2025 19:37
@viirya viirya requested a review from andygrove December 29, 2025 01:25
Ok(Arc::new(result) as ArrayRef)
/// Helper function to perform string replacement into a reusable String buffer
#[inline]
fn replace_into_string(buffer: &mut String, string: &str, from: &str, to: &str) {

I wonder whether we lose other optimizations that Rust's str::replace provides. For example, looking at the code: https://doc.rust-lang.org/src/alloc/str.rs.html#268

It has a fast path for replacing a single ASCII character with another. Do our benchmarks cover this case?

@viirya viirya Dec 29, 2025


Excellent point! Thanks for the review. The single ASCII character fast path optimization has been added.

@viirya viirya force-pushed the replace_optimize_v2 branch from abdede8 to 39ae000 Compare December 29, 2025 06:56
Add a fast path for replacing single ASCII characters with another single
ASCII character, matching Rust's str::replace() optimization. This enables
vectorization and avoids UTF-8 boundary checking overhead.

Changes:
- Added ASCII character detection in replace_into_string()
- When both 'from' and 'to' are single ASCII bytes, use direct byte mapping
- Updated benchmark to include single ASCII character replacement tests

Optimization:
- Fast path operates directly on bytes using simple map operation
- Compiler can vectorize the byte-wise replacement
- Avoids overhead of match_indices() pattern matching for this common case
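
The fast path described in the lists above can be sketched roughly as follows. This is an illustration under assumptions, not the merged implementation: the name `replace_into_string_sketch` and the body are hypothetical, and only the idea (a byte-wise map when both `from` and `to` are a single byte, with `match_indices()` as the general fallback) comes from the commit message.

```rust
/// Hypothetical sketch: replaces every `from` in `string` with `to`, writing
/// the result into `buffer` and reusing the buffer's allocation.
fn replace_into_string_sketch(buffer: &mut String, string: &str, from: &str, to: &str) {
    buffer.clear();

    // Fast path: both patterns are exactly one byte long. A one-byte `&str`
    // is always a single ASCII character, so no separate ASCII check is needed.
    if let (&[from_byte], &[to_byte]) = (from.as_bytes(), to.as_bytes()) {
        // SAFETY: `string` is valid UTF-8, and ASCII bytes never occur inside
        // a multi-byte UTF-8 sequence, so swapping one ASCII byte for another
        // leaves the bytes valid UTF-8.
        let bytes = unsafe { buffer.as_mut_vec() };
        // A plain per-byte map, which the compiler is free to vectorize.
        bytes.extend(
            string
                .bytes()
                .map(|b| if b == from_byte { to_byte } else { b }),
        );
        return;
    }

    // General path: locate each match and copy the pieces around it.
    let mut last_end = 0;
    for (start, matched) in string.match_indices(from) {
        buffer.push_str(&string[last_end..start]);
        buffer.push_str(to);
        last_end = start + matched.len();
    }
    buffer.push_str(&string[last_end..]);
}
```

The fast path never needs to find character boundaries, because every byte is either copied or swapped in place; multi-byte characters in the input pass through untouched. Any other combination, such as a multi-byte `to`, falls through to the general path.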

Benchmark Results (Single ASCII Character Replacement) against previous commit:
- size=1024, str_len=32:  29.5 µs → 21.4 µs (27% faster)
- size=1024, str_len=128: 73.9 µs → 23.4 µs (68% faster)
- size=4096, str_len=32:  121.8 µs → 85.6 µs (30% faster)
- size=4096, str_len=128: 316.9 µs → 83.8 µs (74% faster)

The fast path cuts benchmark time by 27-74%, and the benefit grows with
string length. For 128-character strings, we achieve over 3x speedup
(about 3.2x at size=1024 and 3.8x at size=4096) by enabling vectorization
and eliminating pattern matching overhead.
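
For readers comparing the "% faster" figures with the "3x" claim, the two are related as in the small sketch below. The function names are hypothetical; the formulas are the usual ones (percentage reduction in time, and the ratio of baseline to optimized time), and they agree with the numbers listed above.

```rust
/// Percentage of the baseline time that the optimization removes.
fn time_reduction_percent(baseline_us: f64, optimized_us: f64) -> f64 {
    (baseline_us - optimized_us) / baseline_us * 100.0
}

/// How many times faster the optimized version runs.
fn speedup_factor(baseline_us: f64, optimized_us: f64) -> f64 {
    baseline_us / optimized_us
}
```

For the size=4096, str_len=128 case, 316.9 µs → 83.8 µs gives a reduction of about 73.6% and a speedup factor of about 3.8.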

This addresses reviewer feedback about capturing Rust's str::replace()
optimization tricks for single ASCII character replacements.
@viirya viirya force-pushed the replace_optimize_v2 branch from 39ae000 to 4f36125 Compare December 29, 2025 07:11
@rluvaton rluvaton added the performance Make DataFusion faster label Dec 29, 2025
@andygrove andygrove left a comment


Nice speedups! Thanks @viirya.

@viirya viirya added this pull request to the merge queue Dec 29, 2025
Merged via the queue into apache:main with commit 94709dc Dec 29, 2025
31 checks passed
viirya commented Dec 29, 2025

Thanks @andygrove @Jefffrey

@viirya viirya deleted the replace_optimize_v2 branch December 29, 2025 17:41