CHASM retry all tasks on standby until active side has invalidated them #10552
Open
awln-temporal wants to merge 2 commits into
yycptt
reviewed
Jun 9, 2026
Member
Hmm, the changes in this file seem unrelated. Can we split them into a separate PR?
valid, err := validateChasmSideEffectTask(ctx, ms, task)
if err != nil || !valid {
_, err = validateChasmSideEffectTask(ctx, ms, task)
Member
I don't recall us discussing changing the side effect task logic on the standby side as well, but yeah, I agree we can make the same change.
However, I don't think it's as simple as just ignoring the valid flag. We only want to ignore tasks that are "invalid" because the task validator logic changed, not tasks that are "invalid" because, say, the component was not found or the corresponding logical task was not found. The "valid" flag returned today covers both cases.
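One way to picture the distinction drawn above is a minimal Go sketch in which validation reports why a task is not valid instead of a single boolean. Everything here is hypothetical and for illustration only: `validationOutcome`, its constants, `errTaskRetry`, and `standbyAction` are invented names, not part of the CHASM API or of this PR.

```go
package main

import (
	"errors"
	"fmt"
)

// validationOutcome is a hypothetical tri-state result. The validator
// discussed in this PR returns (bool, error) and does not expose the reason.
type validationOutcome int

const (
	// outcomeValid: the task validator accepted the task.
	outcomeValid validationOutcome = iota
	// outcomeRejectedByValidator: the validator logic rejected the task,
	// for example after a deployment made validation stricter.
	outcomeRejectedByValidator
	// outcomeTargetGone: the component or the logical task no longer exists.
	outcomeTargetGone
)

// errTaskRetry is a hypothetical sentinel meaning "keep the physical task and retry".
var errTaskRetry = errors.New("standby task not yet invalidated by replication")

// standbyAction decides what the standby side does with a task.
// Only a missing component or logical task lets the task be dropped;
// a validator rejection is ignored and the task is retried.
func standbyAction(outcome validationOutcome) error {
	switch outcome {
	case outcomeTargetGone:
		return nil // safe to drop: nothing left to wait for
	default:
		return errTaskRetry // valid or validator-rejected: keep retrying
	}
}

func main() {
	for _, o := range []validationOutcome{outcomeValid, outcomeRejectedByValidator, outcomeTargetGone} {
		fmt.Printf("outcome=%d retry=%t\n", o, standbyAction(o) != nil)
	}
}
```

The point of the sketch is only that the two "invalid" cases need to be distinguishable by the caller; how the real validator would surface that is left open.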
return false, nil
func(_ chasm.NodePureTask, _ chasm.TaskAttributes, _ any) (bool, error) {
// Any task present means replication has not yet removed it — retry.
mutableState,
task,
func(node chasm.NodePureTask, taskAttributes chasm.TaskAttributes, task any) (bool, error) {
ok, err := node.ValidatePureTask(ctx, taskAttributes, task)
Member
Looks like we can remove the ValidatePureTask method.
What
Retry CHASM tasks on the standby cluster indefinitely, until replication from the active cluster invalidates the corresponding logical tasks.
Why
Currently, if the standby cluster runs a CHASM task and finds no valid logical tasks, it discards the physical task. A code deployment that makes task validation stricter can therefore cause physical tasks to be discarded and leave executions stuck. To prevent this, the standby cluster needs to keep the physical task and discard it only once the active cluster has completed or dropped it.
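The intended behavior can be sketched in Go roughly as follows. This is a simplified model under stated assumptions, not the code in this PR: `standbyState`, `applyReplication`, `processPhysicalTask`, and `errRetryOnStandby` are hypothetical names, and logical tasks are reduced to string IDs.

```go
package main

import (
	"errors"
	"fmt"
)

// errRetryOnStandby is a hypothetical sentinel telling the task queue to
// keep the physical task and try again later.
var errRetryOnStandby = errors.New("logical task still present on standby; retry")

// standbyState is a toy stand-in for the standby cluster's view of an
// execution: the set of logical tasks that replication has not yet removed.
type standbyState struct {
	logicalTasks map[string]struct{}
}

func newStandbyState(ids ...string) *standbyState {
	s := &standbyState{logicalTasks: make(map[string]struct{})}
	for _, id := range ids {
		s.logicalTasks[id] = struct{}{}
	}
	return s
}

// applyReplication models state replicated from the active cluster after it
// has completed or dropped the given logical tasks.
func (s *standbyState) applyReplication(removed ...string) {
	for _, id := range removed {
		delete(s.logicalTasks, id)
	}
}

// processPhysicalTask never consults validator logic. As long as the logical
// task is still present, the physical task is retried; once replication has
// removed it, the physical task can be discarded.
func (s *standbyState) processPhysicalTask(id string) error {
	if _, present := s.logicalTasks[id]; present {
		return errRetryOnStandby
	}
	return nil
}

func main() {
	state := newStandbyState("timer-1")
	fmt.Println("before replication:", state.processPhysicalTask("timer-1"))

	// The active cluster completes the task and replicates the new state.
	state.applyReplication("timer-1")
	fmt.Println("after replication:", state.processPhysicalTask("timer-1"))
}
```

In this model a stricter validator on the standby side cannot cause a drop, because presence of the logical task is the only input to the decision.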
How did you test it?