capabilities: retain don2don request IDs longer #21677
prashantkumar1982 wants to merge 1 commit into develop
  for requestID, executeReq := range r.requestIDToRequest {
-     if executeReq.request.Expired() {
+     if executeReq.request.Evictable(commoncap.DefaultExecutableRequestTimeout) {
What does commoncap.DefaultExecutableRequestTimeout correspond to in practice?
IMO we should keep these requests around for minutes, roughly 5 to 10; I don't think this will be big enough.




Summary
We've seen in production that some lagging nodes can send the same Don2Don request ID well after the earlier copies of that request have already been cleaned up on the capability DON.
When that happens, the later message is treated like a fresh request instead of being recognized as a duplicate of an old one. That creates confusing follow-on errors in the stack because we lose the original dedup context too early.
This change increases the eviction window for executable Don2Don request IDs so we retain old request state for longer and get a better error surface when delayed duplicates arrive.
What this change does
Uses DefaultExecutableRequestTimeout as a floor for retention: executable request IDs are kept for max(requestTimeout, DefaultExecutableRequestTimeout) instead of only requestTimeout