-
Notifications
You must be signed in to change notification settings - Fork 10.5k
feat: "Spanify" DataProtector with IBufferWriter<byte> #64262
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR introduces span-based APIs for DataProtection using IBufferWriter<byte> to improve performance and reduce memory allocations. The implementation adds ISpanDataProtector and ISpanAuthenticatedEncryptor interfaces that enable zero-allocation scenarios through generic type parameters with allows ref struct constraints.
Key changes:
- New span-based interfaces
ISpanDataProtectorandISpanAuthenticatedEncryptorwithIBufferWriter<byte>destination parameters - Implementations across all encryptors (Managed, CNG CBC, CNG GCM, AES GCM)
- Two buffer writer implementations:
PooledArrayBufferWriter<T>(class) andRefPooledArrayBufferWriter<T>(ref struct) - Comprehensive test coverage and benchmarking infrastructure
Reviewed Changes
Copilot reviewed 36 out of 36 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| src/Shared/PooledArrayBufferWriter.cs | Adds class-based pooled buffer writer implementation with validation logic issue |
| src/Shared/Buffers/RefPooledArrayBufferWriter.cs | Adds ref struct pooled buffer writer with similar validation issue |
| src/DataProtection/Abstractions/src/ISpanDataProtector.cs | Defines span-based data protector interface |
| src/DataProtection/DataProtection/src/AuthenticatedEncryption/ISpanAuthenticatedEncryptor.cs | Defines span-based authenticated encryptor interface |
| src/DataProtection/DataProtection/src/KeyManagement/KeyRingBasedSpanDataProtector.cs | Implements span-based data protector using key ring |
| src/DataProtection/DataProtection/src/Managed/ManagedAuthenticatedEncryptor.cs | Adds span API implementations with unnecessary using directive |
| src/DataProtection/DataProtection/src/Managed/AesGcmAuthenticatedEncryptor.cs | Implements span-based GCM encryption |
| src/DataProtection/DataProtection/src/Cng/*.cs | Implements span-based APIs for CNG encryptors |
| src/DataProtection/samples/KeyManagementSimulator/Program.cs | Updates mock encryptor with typo in parameter name |
| src/DataProtection/DataProtection/test/**/*.cs | Adds comprehensive roundtrip tests |
| src/DataProtection/benchmarks/** | Adds benchmarking project showing performance improvements |
| if (sizeHint <= 0) | ||
| { | ||
| throw new ArgumentOutOfRangeException(nameof(sizeHint), actualValue: sizeHint, $"{nameof(sizeHint)} ('{sizeHint}') must be a non-negative value."); | ||
| } |
Copilot
AI
Nov 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The condition sizeHint <= 0 on line 175 is incorrect. It should check if sizeHint < 0 (negative) instead. The next condition on line 180 checks sizeHint == 0, which means the first condition will always throw for sizeHint == 0 before reaching the second condition. This creates unreachable code.
| if (sizeHint <= 0) | ||
| { | ||
| throw new ArgumentOutOfRangeException(nameof(sizeHint), actualValue: sizeHint, $"{nameof(sizeHint)} ('{sizeHint}') must be a non-negative value."); |
Copilot
AI
Nov 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The same issue exists here: checking sizeHint <= 0 followed by sizeHint == 0 creates unreachable code. The condition should check sizeHint < 0 instead to allow the zero case to be handled by the subsequent condition.
| byte[] IAuthenticatedEncryptor.Encrypt(ArraySegment<byte> plaintext, ArraySegment<byte> _additionalAuthenticatedData) => plaintext.ToArray(); | ||
| public byte[] Encrypt(ArraySegment<byte> plaintext, ArraySegment<byte> _additionalAuthenticatedData) => plaintext.ToArray(); | ||
|
|
||
| public void Encrypt<TWriter>(ReadOnlySpan<byte> plainttext, ReadOnlySpan<byte> additionalAuthenticatedData, ref TWriter destination) where TWriter : System.Buffers.IBufferWriter<byte>, allows ref struct |
Copilot
AI
Nov 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in parameter name: should be plaintext instead of plainttext (double 't').
| using System; | ||
| using System.Buffers; | ||
| using System.Diagnostics; | ||
| using System.Drawing; |
Copilot
AI
Nov 11, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnecessary using directive System.Drawing was added. This namespace is not used in the file and should be removed.
| using System.Drawing; |
| void Protect<TWriter>(ReadOnlySpan<byte> plaintext, ref TWriter destination) | ||
| where TWriter : IBufferWriter<byte> | ||
| #if NET | ||
| , allows ref struct |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be better to drop the netstandard support for this high-performance (perhaps unwieldy) API limit ISpanDataProtector and ISpanAuthenticatedEncryptor to .NET 11 and above?
|
|
||
| private void CheckAndResizeBuffer(int sizeHint) | ||
| { | ||
| if (sizeHint <= 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| if (sizeHint <= 0) | |
| if (sizeHint < 0) |
Good catch @copilot. We should add a test case where sizeHint is zero.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I never should have @-mentioned copilot. Of course, it opened a PR because of this 😆
|
|
||
| var previousBuffer = oldBuffer.AsSpan(0, _index); | ||
| previousBuffer.CopyTo(_rentedBuffer); | ||
| previousBuffer.Clear(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| previousBuffer.Clear(); |
Do we need to clear here? Presumably this will all be overwritten anyway. If we did want to clear, we should probably pass clearArray: true to ArrayPool<byte>.Return right below, but I don't think that's necessary.
| // Optionally clear the buffer before returning to pool (security) | ||
| // Uncomment if needed for sensitive data: | ||
| // _buffer.AsSpan(0, _index).Clear(); | ||
|
|
||
| ArrayPool<T>.Shared.Return(_rentedBuffer, clearArray: false); | ||
| _buffer = null!; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No reason to leave the dead code around or to be explicit about clearArray when not clearing is the well-known default. But if we did want to reduce risk, we could create an ArrayPool using ArrayPool<byte>.Create(). that's used exclusively for DataProtection rather than use the shared one. This would also give us more control over the max size of pooled arrays and the maximum number of arrays to pool.
It's also unclear what nulling out the buffer does. Presumably the stack frame with the ref struct will be popped shortly after this will be disposed.
| // Optionally clear the buffer before returning to pool (security) | |
| // Uncomment if needed for sensitive data: | |
| // _buffer.AsSpan(0, _index).Clear(); | |
| ArrayPool<T>.Shared.Return(_rentedBuffer, clearArray: false); | |
| _buffer = null!; | |
| ArrayPool<T>.Shared.Return(_rentedBuffer); |
| } | ||
|
|
||
| [MethodImpl(MethodImplOptions.AggressiveInlining)] | ||
| private void ThrowIfDisposed() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we ever give user code access to this IBufferWriter<T>? If not, the most efficient thing would probably be to replace this with a Debug.Assert.
| public void Advance(uint count) | ||
| => Advance((int)count); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| public void Advance(uint count) | |
| => Advance((int)count); |
Other than tests, don't we always have to call the IBufferWriter overload that takes int anyway? I see a bunch of calls to both Advance(checked((int) and just Advance((int). We should probably be more consistent with it. Since we reject any count below zero, I think unchecked conversions from uint are fine.
| // no rented buffer initially - only if we need to grow over the limits of initialBuffer | ||
| _rentedBuffer = null!; | ||
|
|
||
| _buffer = initialBuffer; | ||
| _index = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this comment is necessary once we properly mark the field as nullable.
| // no rented buffer initially - only if we need to grow over the limits of initialBuffer | |
| _rentedBuffer = null!; | |
| _buffer = initialBuffer; | |
| _index = 0; | |
| _buffer = initialBuffer; |
| var buffer = destination.GetSpan(checked((int)dwRequiredSize)); | ||
|
|
||
| // Clone IV again for the actual decryption call | ||
| byte* pbClonedIV2 = stackalloc byte[checked((int)_symmetricAlgorithmBlockSizeInBytes)]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the second call to checked((int)_symmetricAlgorithmBlockSizeInBytes). While checked operations aren't that expensive, it's probably good to reuse the operation or just do the unchecked cast.
| byte* pbClonedIV2 = stackalloc byte[checked((int)_symmetricAlgorithmBlockSizeInBytes)]; | |
| byte* pbClonedIV2 = stackalloc byte[(int)_symmetricAlgorithmBlockSizeInBytes]; |
| CryptoUtil.Assert(dwActualDecryptedByteCount <= dwEstimatedDecryptedByteCount, "dwActualDecryptedByteCount <= dwEstimatedDecryptedByteCount"); | ||
| if (dwActualDecryptedByteCount == dwEstimatedDecryptedByteCount) | ||
| #else | ||
| var pooledArrayBuffer = new PooledArrayBufferWriter<byte>(outputSize); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any reason we cannot use using for these?
| var pooledArrayBuffer = new PooledArrayBufferWriter<byte>(outputSize); | |
| using var pooledArrayBuffer = new PooledArrayBufferWriter<byte>(outputSize); |
|
|
||
| // Assumption: pbCipherText := { keyModifier | IV | encryptedData | MAC(IV | encryptedPayload) } | ||
| #if NET | ||
| byte[] rentedBuffer = null!; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a general rule, we shouldn't use the null-forgiving operator when initializing variables/fields/properties, unless we're intentionally lying about the nullability of a public property that's always non-null by the time the user sees it, and even that's super questionable. It's better to save the null-forgiving until later in the code when we know for sure the variable/field/property must not be null. In this case, the only other place we use it we have a if (rentedBuffer is not null) anyways.
| byte[] rentedBuffer = null!; | |
| byte[]? rentedBuffer = null; |
This is really making me wish C# had macros though. I wonder if we should just have another overload to RefPooledArrayBufferWriter that takes an initial size other than a stackalloced Span. Not only would it simplify a lot of this repeated code a little bit, this would ensure the RefPooledArrayBufferWriter is always be responsible for renting and returning arrays to the pool.
Please, review with caution! This is cryptography and failures here can be pretty dramatic...
This is a 2nd iteration over spanification of DataProtection. 1st iteration used a different pattern:
GetSize()followed by actual operation (protection or unprotection). Here we introduceIBufferWriter<byte>as destination parameter.The following idea came up due to the internals of DataProtection: it may happen (though it's rare) that key changes during the user's interaction with DataProtection API, and suddenly the destination buffer can be of an insufficient size. Consider:
However, we can simply solve it by using
IBufferWriter<byte>- since it has an ability to expand its size on-the-fly.We have experimented with the API form a lot, but decided to go with such API:
and
New API has a few benefits and intentions:
IBufferWriter<byte>avoids an undeterministic scenario where buffer may be not of enough sizeGetSize()thenProtect()/Unprotect()with not 100% guarantee that works, which means a fallback to allocatey API is required).TWriterwith a constraint to beIBufferWriter<byte>to allow passing in structs for even further optimization (reduction of allocation of a ref-type for buffer instance)refto allows passing in a mutable ref struct. That should help in most of the cases DataProtection was designed to be used for - like Antiforgery where the input is quite small (under 100 bytes for instance), and buffer for encryption can be stackalloc'ed (which is probably the most performant mechanism to operate with a temporary buffer)Artificial benchmarks show expected results: ref struct which uses stackalloc'ed buffer is the fastest way to operate.
That is also shown in the real benchmarks I've done on windows machine comparing original implementation, passing in a buffer which is a class, and a buffer whcih is a ref struct. With ref struct we get the most perf and use the least allocations.
Fixes #44758