eventservice: split large scanned transactions#5511
Conversation
|
Skipping CI for Draft Pull Request. |
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Code Review
This pull request introduces support for row-level resume tokens and large transaction spilling in the event service. It adds a new EventIteratorWithScanPosition interface, updates the event store iterator to track and return opaque scan positions, and implements disk-based spilling for large transactions via largeTxnInsertSpill and largeTxnScanState. Review feedback identifies two key issues in large_txn_spill.go: a potential resource leak where a failed Close() in Cleanup() skips deleting the temporary file, and a loose record length check that could lead to OOM panics if the spill file is corrupted.
Important
The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.
| if err := s.Close(); err != nil { | ||
| return err | ||
| } | ||
| if s.cleaned { | ||
| return nil | ||
| } | ||
| s.cleaned = true | ||
| if s.path == "" { | ||
| return nil | ||
| } | ||
|
|
||
| err := os.Remove(s.path) | ||
| if err != nil && !os.IsNotExist(err) { | ||
| return errors.Trace(err) | ||
| } | ||
| return nil |
There was a problem hiding this comment.
If s.Close() returns an error, Cleanup() will return early and skip the execution of os.Remove(s.path). This will cause the temporary spill file to be leaked on disk. We should ensure that os.Remove is always attempted even if s.Close() fails, and that s.cleaned is set to true to prevent future cleanup attempts.
closeErr := s.Close()
if s.cleaned {
if closeErr != nil {
return errors.Trace(closeErr)
}
return nil
}
s.cleaned = true
if s.path == "" {
if closeErr != nil {
return errors.Trace(closeErr)
}
return nil
}
removeErr := os.Remove(s.path)
if removeErr != nil && !os.IsNotExist(removeErr) {
if closeErr != nil {
return errors.Trace(closeErr)
}
return errors.Trace(removeErr)
}
if closeErr != nil {
return errors.Trace(closeErr)
}
return nil| if recordLen > uint64(int(^uint(0)>>1)) { | ||
| return nil, errors.Errorf("large txn spill record is too large: %d", recordLen) | ||
| } |
There was a problem hiding this comment.
Enforcing a limit of math.MaxInt is extremely loose and can still lead to out-of-memory (OOM) panics when allocating memory for data (e.g., make([]byte, int(recordLen))) if the spill file is corrupted or contains invalid data. Since a single KV entry size in TiKV is typically limited to a few megabytes (and at most 128MB), we should enforce a much safer and more reasonable upper bound (e.g., 128MB) to prevent OOM vulnerabilities.
| if recordLen > uint64(int(^uint(0)>>1)) { | |
| return nil, errors.Errorf("large txn spill record is too large: %d", recordLen) | |
| } | |
| const maxRecordLen = 128 * 1024 * 1024 // 128MB | |
| if recordLen > maxRecordLen { | |
| return nil, errors.Errorf("large txn spill record is too large: %d", recordLen) | |
| } |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
[FORMAT CHECKER NOTIFICATION] Notice: To remove the 📖 For more info, you can check the "Contribute Code" section in the development guide. |
What problem does this PR solve?
Issue Number: close #xxx
EventBroker currently scans a whole large upstream transaction into memory before it can send DML to the event collector. Very large transactions can therefore cause OOM, especially when UK-changing updates cache deferred insert rows in memory.
What is changed and how it works?
This PR adds row-level eventstore scan resume tokens and uses them in EventBroker scanning so
transaction-atomicity=nonecan emit bounded fragments from one large transaction without sending resolved-ts for that commit-ts early.It also spills deferred insert halves of UK-changing updates to local disk, drains them after the original delete phase, and cleans spill state on reset/remove. EventCollector reset semantics remain checkpoint-ts based.
Check List
Tests
Questions
Will it cause performance regression or break compatibility?
No compatibility break is expected. The split path is only enabled when transaction atomicity allows splitting. It reduces EventBroker peak memory for large transactions, with local spill I/O for UK-changing update insert halves.
Do you need to update user documentation, design documentation or monitoring documentation?
No user documentation update is required for this internal scan behavior change.
Release note