cut: support multibyte characters in non-UTF-8 locales #13194

Open
sylvestre wants to merge 1 commit into uutils:main from sylvestre:mb-non-utf8

@sylvestre (Contributor)

Add locale-aware character handling so -c counts characters, -b -n keeps multibyte characters whole, and -d accepts a single multibyte delimiter.

This should make the GNU test tests/cut/mb-non-utf8.sh pass.
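
To illustrate what character-aware handling means in a non-UTF-8 locale, here is a minimal sketch that uses EUC-JP as the example encoding. It is not the code from this PR: the helpers `euc_jp_char_len`, `cut_chars`, and `split_fields` are hypothetical names, and hard-coding one encoding is an assumption made only to keep the example self-contained. The point it shows is that both `-c` and a multibyte `-d` have to walk the line character by character, because a plain byte search can match a delimiter that straddles two characters.

```rust
/// Hypothetical helper: byte length of the EUC-JP character at the start of
/// `bytes`. Invalid or truncated sequences count as one byte so that a scan
/// always makes progress.
fn euc_jp_char_len(bytes: &[u8]) -> usize {
    let is_trail = |b: u8| (0xA1u8..=0xFE).contains(&b);
    match bytes {
        [] => 0,
        // 0x8F introduces a three-byte JIS X 0212 character.
        [0x8F, b1, b2, ..] if is_trail(*b1) && is_trail(*b2) => 3,
        // 0x8E introduces a two-byte half-width katakana character.
        [0x8E, b1, ..] if (0xA1u8..=0xDF).contains(b1) => 2,
        // Two-byte JIS X 0208 character.
        [b0, b1, ..] if is_trail(*b0) && is_trail(*b1) => 2,
        // ASCII, or a byte that does not start a valid sequence.
        _ => 1,
    }
}

/// Sketch of `-c LO-HI` (1-based, inclusive, assumes 1 <= lo <= hi):
/// select whole characters rather than bytes.
fn cut_chars(line: &[u8], lo: usize, hi: usize) -> Vec<u8> {
    let mut out = Vec::new();
    let (mut pos, mut index) = (0, 1);
    while pos < line.len() {
        let len = euc_jp_char_len(&line[pos..]);
        if index >= lo && index <= hi {
            out.extend_from_slice(&line[pos..pos + len]);
        }
        pos += len;
        index += 1;
    }
    out
}

/// Sketch of `-d DELIM` with a single multibyte delimiter: the delimiter is
/// only recognized when it starts on a character boundary.
fn split_fields<'a>(line: &'a [u8], delim: &[u8]) -> Vec<&'a [u8]> {
    let mut fields = Vec::new();
    let (mut start, mut pos) = (0, 0);
    while pos < line.len() {
        let len = euc_jp_char_len(&line[pos..]);
        if &line[pos..pos + len] == delim {
            fields.push(&line[start..pos]);
            start = pos + len;
        }
        pos += len;
    }
    fields.push(&line[start..]);
    fields
}
```

For example, with the line bytes `A4 A1 A1 A4` (two two-byte characters) and the delimiter `A1 A1`, a naive byte search would find a match at offset 1, while `split_fields` returns the line as a single field.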

@github-actions

github-actions Bot commented Jun 29, 2026

GNU testsuite comparison:

Congrats! The gnu test tests/cut/mb-non-utf8 is no longer failing!
Note: The gnu test tests/misc/write-errors was skipped on 'main' but is now failing.

Comment thread tests/by-util/test_cut.rs Outdated
Comment thread tests/by-util/test_cut.rs Outdated
Comment thread tests/by-util/test_cut.rs
Comment thread tests/by-util/test_cut.rs Outdated
Comment thread tests/by-util/test_cut.rs
@codspeed-hq

codspeed-hq Bot commented Jun 29, 2026

Merging this PR will degrade performance by 87.13%

⚠️ Different runtime environments detected

Some benchmarks with significant performance changes were compared across different runtime environments,
which may affect the accuracy of the results.

Open the report in CodSpeed to investigate

❌ 1 regressed benchmark
✅ 322 untouched benchmarks
⏩ 46 skipped benchmarks [1]

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|---|---|---|---|---|
| Simulation | cut_characters | 7.4 ms | 57.6 ms | -87.13% |
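
As a rough sanity check on the reported figure, the sketch below recomputes it from the rounded BASE and HEAD values. It assumes the efficiency column is BASE / HEAD - 1, which the report does not state; the function names are illustrative. Under that assumption the rounded inputs give about -87.15%, in line with the reported -87.13%, and correspond to a slowdown of roughly 7.8x.

```rust
/// Assumed formula for the "Efficiency" column: BASE / HEAD - 1.
fn efficiency(base_ms: f64, head_ms: f64) -> f64 {
    base_ms / head_ms - 1.0
}

/// The same change expressed as a slowdown factor: HEAD / BASE.
fn slowdown(base_ms: f64, head_ms: f64) -> f64 {
    head_ms / base_ms
}
```

Calling `efficiency(7.4, 57.6)` and `slowdown(7.4, 57.6)` reproduces the two numbers above from the table values.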

Tip

Investigate this regression by commenting @codspeedbot fix this regression on this PR, or directly use the CodSpeed MCP with your agent.


Comparing sylvestre:mb-non-utf8 (350f371) with main (43ef8d0)

Footnotes

  1. 46 benchmarks were skipped, so the baseline results were used instead. If they were deleted from the codebase, click here and archive them to remove them from the performance reports.
