Skip to content

Conversation

@pav-kv
Copy link
Collaborator

@pav-kv pav-kv commented Nov 9, 2025

The mergeSpans function used for SortAndDedup calls on SpanSet is not tested and not clear about the semantics when the input slice of Span@Timestamp has keys that are present at multiple timestamps. In the latter case, mergeSpans does not output a minimal / "canonical" set of spans.

This does not look good because the result of this call is fed to the latch manager, i.e. this function is on the critical path for CRDB correctness / performance. Adding a test revealed a bug in this function that could result in placing a wider latch than was requested. This PR fixes this bug, adds tests, clarifies the semantics of and refactors mergeSpans (and the roachpb.MergeSpans that it is mostly a copy of) into a readable state.

Epic: none
Release note (bug fix): a bug in key span merging could result in placing wider latches than needed for a request, which could impose unnecessary contention. The effect of this bug hasn't been observed in the wild, and possibly never happened.

Epic: none
Release note: none
@cockroach-teamcity
Copy link
Member

This change is Reviewable

@pav-kv pav-kv changed the title spanset: simplify mergeSpans spanset: fix and simplify mergeSpans Nov 9, 2025
if cur.Timestamp != prev.Timestamp {
r = append(r, cur)
} else if len(prev.EndKey) == 0 { // prev is a point key
if !cur.Key.Equal(prev.Key) { // cur.Key > prev.Key
Copy link
Collaborator Author

@pav-kv pav-kv Nov 9, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this func goes an extra mile to merge [a,b) + [b] = [a,b] (see below). However, it doesn't handle cases like two consecutive point keys:

[a] + [a.Next()] = [a,a.Next().Next())

I wonder if we should do it for completeness. Today:

[a,b) + [b] + [b.Next()] = [a,b.Next().Next())
        [b] + [b.Next()] = [b] + [b.Next()]

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a commit that fixes it, but I'll send it separately after this PR merges. As is, the PR does not change the behaviour of mergeSpans.

@pav-kv pav-kv force-pushed the spanset-tweaks branch 3 times, most recently from b73d78c to 38d2e14 Compare November 10, 2025 10:29
Epic: none
Release note: none
It was passed by pointer to avoid allocation when converted to
sort.Interface, but it's no longer needed with slices.SortFunc.

Epic: none
Release note: none
Epic: none
Release note: none
Epic: none
Release note: none
Epic: none
Release note: none
@pav-kv pav-kv force-pushed the spanset-tweaks branch 2 times, most recently from 8f9764e to cb4c5d0 Compare November 10, 2025 10:49
@pav-kv pav-kv marked this pull request as ready for review November 10, 2025 10:49
@pav-kv pav-kv requested review from a team as code owners November 10, 2025 10:49
@pav-kv pav-kv requested review from jbowens, msbutler, stevendanna and tbg and removed request for a team November 10, 2025 10:49
Copy link
Collaborator

@stevendanna stevendanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I like this cleanup.

For the commits with "simplify" I've mostly tried to just verify that I was able to follow the logic without thinking too hard and I think this accomplishes that.

One thing I've seen us do elsewhere is list out the cases in a block comment above where it is a bit easier to spell them out with labels:

// CASE 1: [a] + [b, any)    = [a], [b, any)
// CASE 2: [a] + [a, any)    = [a, any)
// CASE 3: [a, c) + [d, any) = [a,c], [d,any)
// CASE 4: [a, c) + [c] = [a, c.Next())
// CASE 5: [a, c) + [c, d) = [a, d)
// CASE 6: [a, c) + [b, d) = [a, d)
// CASE 7: [a, c) + [b-bb) = [a, c)

And then inline refer to the case labels.

Not recommending this as I think your inline comments made it straightforward to follow. Just noting it for your consideration in case you prefer it.

Comment on lines +220 to +221
// TODO(pav-kv): does this make CheckAllowed falsely fail in some cases? Maybe
// it's fine: importantly, it should not falsely succeed.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have an example of what you are considering here?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something similar to the concern under TODO(irfansharif) below. Say there are 2 spans [a-b)@10 [b-c)@10 that haven't merged due to mergeSpans best-effortness. If there is a read of span [a-c), CheckAllowed can fail due to not finding any whole span that includes [a-c), even though it's technically allowed here.

I'll address this as part of TODO(pav-kv) next to mergeSpans saying that we can sort by (timestamp, key) and achieve "canonicalization" of this set. Seemingly, the callers don't rely on this being sorted by key (and I also confirmed that by running all kvserver tests locally with this change).

Epic: none
Release note: none
Epic: none
Release note: none
All branches in this loop check timestamp at the end. This was
unnecessarily verbose, error-prone, and led to a bug fixed in the
previous commit. Check the timestamp at the beginning once instead.

Epic: none
Release note: none
Epic: none
Release note: none
Epic: none
Release note: none
Epic: none
Release note: none
Copy link
Collaborator Author

@pav-kv pav-kv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

@stevendanna the CASE comments look nice, I've adopted the "notation" from it in the inlined comments.

Comment on lines +220 to +221
// TODO(pav-kv): does this make CheckAllowed falsely fail in some cases? Maybe
// it's fine: importantly, it should not falsely succeed.
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something similar to the concern under TODO(irfansharif) below. Say there are 2 spans [a-b)@10 [b-c)@10 that haven't merged due to mergeSpans best-effortness. If there is a read of span [a-c), CheckAllowed can fail due to not finding any whole span that includes [a-c), even though it's technically allowed here.

I'll address this as part of TODO(pav-kv) next to mergeSpans saying that we can sort by (timestamp, key) and achieve "canonicalization" of this set. Seemingly, the callers don't rely on this being sorted by key (and I also confirmed that by running all kvserver tests locally with this change).

@pav-kv
Copy link
Collaborator Author

pav-kv commented Nov 10, 2025

bors r=stevendanna

craig bot pushed a commit that referenced this pull request Nov 10, 2025
157105: spanset: fix and simplify mergeSpans r=stevendanna a=pav-kv

The `mergeSpans` function used for `SortAndDedup` calls on `SpanSet` is not tested and not clear about the semantics when the input slice of `Span@Timestamp` has keys that are present at multiple timestamps. In the latter case, `mergeSpans` does not output a minimal / "canonical" set of spans.

This does not look good because the result of this call is fed to the latch manager, i.e. this function is on the critical path for CRDB correctness / performance. Adding a test revealed a bug in this function that could result in placing a wider latch than was requested. This PR fixes this bug, adds tests, clarifies the semantics of and refactors `mergeSpans` (and the `roachpb.MergeSpans` that it is mostly a copy of) into a readable state.

Epic: none
Release note (bug fix): a bug in key span merging could result in placing wider latches than needed for a request, which could impose unnecessary contention. The effect of this bug hasn't been observed in the wild, and possibly never happened.

Co-authored-by: Pavel Kalinnikov <pavel@cockroachlabs.com>
@craig
Copy link
Contributor

craig bot commented Nov 10, 2025

Build failed:

@pav-kv
Copy link
Collaborator Author

pav-kv commented Nov 10, 2025

bors retry

@craig
Copy link
Contributor

craig bot commented Nov 10, 2025

@craig craig bot merged commit 6471b37 into cockroachdb:master Nov 10, 2025
24 checks passed
@pav-kv pav-kv deleted the spanset-tweaks branch November 10, 2025 20:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants