Conversation
There was a problem hiding this comment.
Pull request overview
This PR optimizes string.Split(char, ...) by adding a dedicated fast path for the most common scenario: splitting on a single separator character, including a new vectorized implementation for scanning separator positions.
Changes:
- Added a single-separator (
separators.Length == 1) specialized path inMakeSeparatorListAny. - Introduced a new
MakeSeparatorListVectorized(..., char c)overload to vectorize scanning for a single separator char. - Adjusted the existing 2–3 separator path to assume
separators.Length >= 2(since the 1-separator case is now handled earlier).
This comment was marked as resolved.
This comment was marked as resolved.
…or non-existent, which is similar to what IndexOf does
This comment was marked as outdated.
This comment was marked as outdated.
src/libraries/System.Private.CoreLib/src/System/String.Manipulation.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/String.Manipulation.cs
Outdated
Show resolved
Hide resolved
| else if (separators.Length <= 1) | ||
| { | ||
| char sep0 = separators[0]; | ||
| if (Vector128.IsHardwareAccelerated && source.Length >= Vector128<ushort>.Count * 2) | ||
| { | ||
| MakeSeparatorListVectorized(source, ref sepListBuilder, sep0); | ||
| return; | ||
| } | ||
|
|
There was a problem hiding this comment.
This introduces a new single-separator fast path and a new vectorized implementation. Please add/extend tests that exercise string.Split(char, ...) for a single separator with input lengths around SIMD boundaries (e.g., below/at/above Vector128<ushort>.Count * 2, and similarly for 256/512 when available) to ensure correctness for long strings and to guard against out-of-range reads in vectorized paths.
There was a problem hiding this comment.
@copilot open a new pull request to apply changes based on this feedback and ensure the tests are also applicable considering the changes that have followed since.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
src/libraries/System.Private.CoreLib/src/System/String.Manipulation.cs
Outdated
Show resolved
Hide resolved
|
@EgorBot -amd -arm using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
using System.IO;
using System.Threading;
using System.Threading.Tasks;
BenchmarkSwitcher.FromAssembly(typeof(Bench).Assembly).Run(args);
// Test code from the C# Discord (not mine)
public class Bench
{
[Benchmark]
[Arguments("testttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttt")]
[Arguments("ttt,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdsat,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdst,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdst,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdst,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdst,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdst,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdst,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdst,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdst,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdst,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdst,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdst,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdst,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdst,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasds")]
[Arguments("abcdef")] // mine
[Arguments("abc,def")] // mine
public string[] StringSplitByString(string str)
{
return str.Split(",");
}
[Benchmark]
[Arguments("testttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttttt")]
[Arguments("ttt,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdsat,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdst,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdst,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdst,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdst,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdst,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdst,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdst,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdst,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdst,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdst,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdst,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdst,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasdst,ts,afd,sadasdsad,sa,f,safasf,sd,fsa,f,sa,fd,s,a,dsa,d,sa,d,safs,a,,,das,dsafda,geadwadasdwad,a,sd,wa,s,d,wa,sd,wa,s,dw,asdw,a,sd,wa,d,wasds")]
[Arguments("abcdef")] // mine
[Arguments("abc,def")] // mine
public string[] StringSplitByChar(string str)
{
return str.Split(',');
}
} |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.
Comments suppressed due to low confidence (1)
src/libraries/System.Private.CoreLib/src/System/SpanHelpers.Packed.cs:1147
- These PackSources helpers were made internal to support new callers outside PackedSpanHelpers, but the file-level comment indicates this type exists to hide "private helpers". Consider updating the nearby commentary / documenting the intended usage constraints (e.g., only call when the relevant ISA is supported and CanUsePackedIndexOf has been validated) to avoid future misuse within the assembly.
[MethodImpl(MethodImplOptions.AggressiveInlining)]
[CompExactlyDependsOn(typeof(Avx512BW))]
internal static Vector512<byte> PackSources(Vector512<short> source0, Vector512<short> source1)
{
Debug.Assert(Avx512BW.IsSupported);
// Pack two vectors of characters into bytes. While the type is Vector256<short>, these are really UInt16 characters.
// X86: Downcast every character using saturation.



In some cases
string.Split(char, ...)was slower (significantly) than the singlestringseperator overload - this PR optimizes thecharoverload/s to have much closer performance (still a small gap in 1 case) - the single-charoverloads are the most common case, so I have added the additional optimisation just for that for now - benchmarks below.Main benchmark that was the focus - single char splitter with rare/no seperators (17% improvement on arm, 47% improvement on x64) -
charalmost matches or outperformsstringnow (could probably get it to match by adjusting the branches carefully - I can implement if wanted, but it will probably make it less readable):X64 (AMD):

Arm:

Other cases have not regressed meaningfully (within margin of error) - some have even possibly improved slightly on x64: EgorBot/Benchmarks#28