sort: merge lazily in UTF-8 locales instead of precomputing sort keys #13205
sylvestre wants to merge 1 commit
In a UTF-8 locale, `sort` precomputes a full ICU collation sort key for every line while parsing chunks. The regular sort path needs this to amortize O(n log n) comparisons, but merging performs only O(n log k) comparisons (and none at all when merging a single already-sorted file), so the keys are pure overhead there.

Add `merge_compare()`, which mirrors `compare_by()` but compares whole-line locale keys lazily with the collator. The merge reader now runs with `fast_locale_collation` disabled so `Line::create` no longer computes per-line sort keys, and the merge comparator and dedup use `merge_compare`.

`sort -m /usr/share/dict/words` in a UTF-8 locale goes from ~70ms to ~9ms (10.5x -> 1.32x vs GNU sort), with byte-identical output. Other sort modes and the C locale are unaffected.
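To make the comparison-count argument concrete, here is a minimal sketch of a merge that compares lines lazily instead of building a key per line up front. It is not the code from this PR: `collate` is a hypothetical stand-in for the ICU collator (a case-insensitive comparison with a byte-order tiebreak), `merge_lazy` is an illustrative name, and the head selection uses a linear scan for brevity where a heap would give the O(n log k) bound.

```rust
use std::cmp::Ordering;

/// Hypothetical stand-in for a locale-aware collator comparison.
/// Assumption: case-insensitive order, with byte order as the tiebreak.
fn collate(a: &str, b: &str) -> Ordering {
    a.to_lowercase()
        .cmp(&b.to_lowercase())
        .then_with(|| a.cmp(b))
}

/// Merge already-sorted inputs, calling `collate` only when two input
/// heads actually need to be ordered. No per-line key is precomputed.
/// Returns the merged lines and the number of comparisons performed.
fn merge_lazy<'a>(inputs: &[Vec<&'a str>]) -> (Vec<&'a str>, usize) {
    let mut cursors = vec![0usize; inputs.len()];
    let mut out = Vec::new();
    let mut comparisons = 0;

    loop {
        // Index of the input whose current head is smallest so far.
        let mut best: Option<usize> = None;
        for (i, input) in inputs.iter().enumerate() {
            if cursors[i] >= input.len() {
                continue; // this input is exhausted
            }
            best = match best {
                None => Some(i),
                Some(b) => {
                    comparisons += 1;
                    // Strict `Less` keeps ties on the earlier input (stable).
                    if collate(input[cursors[i]], inputs[b][cursors[b]]) == Ordering::Less {
                        Some(i)
                    } else {
                        Some(b)
                    }
                }
            };
        }
        match best {
            Some(b) => {
                out.push(inputs[b][cursors[b]]);
                cursors[b] += 1;
            }
            None => break, // every input is exhausted
        }
    }
    (out, comparisons)
}
```

With a single input the loop never reaches the comparison branch, which is the case where precomputed keys buy nothing at all.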
GNU testsuite comparison:
Merging this PR will degrade performance by 6.65%
| | Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|---|
| ❌ | Simulation | `merge_pre_sorted_files` | 103 ms | 113.9 ms | -9.58% |
| ❌ | Simulation | `merge_pre_sorted_files_utf8_locale` | 102.8 ms | 113.6 ms | -9.55% |
| ❌ | Simulation | `sort_ascii_utf8_locale` | 15.4 ms | 16.1 ms | -4.67% |
| ❌ | Simulation | `sort_ascii_c_locale` | 16 ms | 16.7 ms | -4.67% |
| ❌ | Simulation | `sort_key_field[500000]` | 767.1 ms | 804.3 ms | -4.63% |
Comparing sylvestre:sort-merge-perf (c5c51af) with main (b76d615)
Footnotes
1. 46 benchmarks were skipped, so the baseline results were used instead.
    /// Comparison used by the merge path.
    ///
    /// This is result-identical to [`compare_by`], but for the whole-line locale-collation
The `compare_by` link makes the Documentation/warnings job fail:

    public documentation for `merge_compare` links to private item `compare_by`
        --> src/uu/sort/src/sort.rs:2626:35
         |
    2626 | /// This is result-identical to [`compare_by`], but for the whole-line locale-collation
         |                                   ^^^^^^^^^^ this item is private
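One way to silence this warning (the `rustdoc::private_intra_doc_links` lint) is to mention the private function in plain code formatting rather than as an intra-doc link. The sketch below is only an illustration: the signatures are simplified to `&str` arguments and the bodies are hypothetical, not the ones in `sort.rs`.

```rust
use std::cmp::Ordering;

// Private helper. Linking to it with [`compare_by`] from the docs of a
// public item is what rustdoc reports as a link to a private item.
fn compare_by(a: &str, b: &str) -> Ordering {
    a.cmp(b)
}

/// Comparison used by the merge path.
///
/// This is result-identical to `compare_by` (written without square
/// brackets, so rustdoc does not try to resolve it as a link).
pub fn merge_compare(a: &str, b: &str) -> Ordering {
    compare_by(a, b)
}
```

Alternatives would be to make `merge_compare` non-public or to make `compare_by` public; which of these fits depends on how the crate exposes its API.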
    pub fn merge_compare<'a>(
        a: &Line<'a>,
        b: &Line<'a>,
        settings: &GlobalSettings,
        a_line_data: &LineData<'a>,
        b_line_data: &LineData<'a>,
    ) -> Ordering {
The function signature looks a bit odd with `settings` in the middle; I would have expected it at the end. But I think that's something for a future PR.