-
Notifications
You must be signed in to change notification settings - Fork 134
fix(gas): fix loop forgotten bug due to concurrency #3482
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: 11-17-fix_pb_rewrite_runner_wf_to_handle_batch_signals
Are you sure you want to change the base?
fix(gas): fix loop forgotten bug due to concurrency #3482
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
3 Skipped Deployments
|
More templates
@rivetkit/cloudflare-workers
@rivetkit/db
@rivetkit/framework-base
@rivetkit/next-js
@rivetkit/react
rivetkit
@rivetkit/sql-loader
@rivetkit/engine-runner
@rivetkit/engine-runner-protocol
commit: |
PR Review: fix(gas): fix loop forgotten bug due to concurrencySummaryThis PR fixes a critical concurrency bug in the gasoline workflow engine where loop event history was being incorrectly cleared, potentially losing events from concurrent iterations. Core Fix Analysis (engine/packages/gasoline/src/db/kv/mod.rs)The Problem: The Solution:
This ensures only past iterations are moved to forgotten history, preserving the current iteration's events. Code Quality: ✅ Excellent
Other Changes Review1. pegboard-runner/src/conn.rs - Workflow migration-workflow(pegboard::workflows::runner::Input {
+workflow(pegboard::workflows::runner2::Input {Concern: Questions:
2. pegboard/src/workflows/runner2.rs - Init signal handling-if init.is_none() {
- init = Some(init_sig);
- check_queue = true;
-}
+init = Some(init_sig);
+check_queue = true;Concern: Potential Issue:
Recommendation: If this change is intentional (e.g., supporting re-initialization), add a comment explaining why. Otherwise, consider keeping the guard or at least logging when init is being overwritten. 3. Cargo.toml - Commented release profile options+# strip = true
+# panic = "abort"
+# overflow-checks = false
+# debug-assertions = falseIssue: ❌ These commented-out options should not be in the PR Recommendation: Remove these lines. They appear to be experimental/debugging configurations that were accidentally included. 4. scripts/docker/build-push.sh - Commented build arg+# --build-arg CARGO_BUILD_MODE=release \Issue: ❌ Commented code in scripts Recommendation: Remove if not needed, or uncomment if intended to be active. 5. scripts/run/docker/engine-rocksdb.sh - Trailing newlineIssue: This is inconsequential but typically files should end with a newline per POSIX standards. Security Considerations✅ No security concerns identified. The fix actually improves data integrity by preventing unintended data loss. Performance Considerations✅ Improved Performance: The bounded range query is more efficient than the original unbounded subspace range, as it scans fewer keys. Test CoverageConcerns:
Recommendations:
Example test structure: #[workflow(ConcurrentLoopTestWorkflow)]
pub async fn concurrent_loop_test(ctx: &mut WorkflowCtx, input: &Input) -> Result<()> {
ctx.loope(0, move |ctx, state| {
async move {
// Trigger some events during iteration
ctx.activity(...).await?;
if *state >= 5 {
return Ok(Loop::Break(()));
}
*state += 1;
Ok(Loop::Continue)
}.boxed()
}).await?;
// Verify all iteration events are in forgotten history
Ok(())
}Summary of RecommendationsMust Fix:
Should Address:
Nice to Have:
Overall AssessmentThe core bug fix is excellent and addresses a real concurrency issue with a correct, performant solution. However, the PR includes several unrelated or questionable changes that should be addressed before merging. Recommendation: Request changes to clean up the non-essential modifications and clarify the intentional behavior changes in runner2.rs. |
4eabd07 to
e68ccc5
Compare
1a4f15a to
4294775
Compare
Code ReviewSummaryThis PR fixes a critical concurrency bug in the workflow loop history management where events from all loop iterations were being moved to forgotten history, not just events from completed iterations. The fix correctly bounds the range query to only include events up to the current iteration. ✅ Positives
🔍 Issues & ConcernsCritical: Missing Key Deletion (
|
PR Review: Fix loop forgotten bug due to concurrencySummaryThis PR addresses a critical bug in the workflow engine's loop event history management. The fix ensures that when moving loop events to forgotten history, only events from iterations 0 through (current iteration - 1) are moved, preventing concurrent loop iterations from incorrectly archiving each other's events. Code Quality & Best Practices ✅Main Fix (gasoline/src/db/kv/mod.rs:2579-2634)Good:
Observation:
Workflow Fix (pegboard/src/workflows/runner2.rs:103-108)Good:
Potential Issue: Question: Is this the intended behavior? Consider these scenarios:
Suggestion: Consider either:
Potential Bugs & Issues1. Edge Case: iteration == 0
|
e68ccc5 to
9a87fd1
Compare
4294775 to
5458e50
Compare
Code ReviewSummaryThis PR fixes a critical bug in the workflow loop event history management where events from all iterations (including future ones) were being moved to forgotten history instead of only past iterations. The fix also includes some related cleanup in the pegboard runner workflow. Critical Bug Fix ✅Location: The Problem: The Solution:
This ensures only past iterations are moved to forgotten history, which is the correct behavior. Code Quality: ⭐⭐⭐⭐⭐
Secondary Changes1. Pegboard Runner Init Logic (
|
PR Review: Fix loop forgotten bug due to concurrencyOverviewThis PR fixes a critical bug in the workflow engine's loop history management where concurrent operations could cause events from ALL iterations of a loop to be moved to forgotten history, rather than just the events from completed iterations up to the current one. Code Quality & ImplementationMain Fix (gasoline/src/db/kv/mod.rs) ✅ The core fix correctly addresses the concurrency issue by:
The comments are also improved, making the intent clearer ("up to the current iteration" instead of just "current events"). Potential Issues & Concerns1. Minor: Iteration = 0 Edge Case When iteration.saturating_sub(1) // = 0 when iteration is 0This means the range would be
Recommendation: Add a guard condition: if iteration > 0 {
// Only move events to forgotten if we have completed iterations
let loop_events_subspace_start = ...
// ... rest of the code
}2. Configuration Changes (Cargo.toml) The commented-out release profile optimizations are concerning: # strip = true
# panic = "abort"
# overflow-checks = false
# debug-assertions = false
Recommendation: Remove these lines or add a comment explaining their purpose. 3. Unrelated Changes Several changes appear unrelated to the loop forgotten bug:
Recommendation: Consider splitting these changes into separate commits or PRs for clearer history. If they're related to reproducing/testing the bug, please document that relationship. Performance Considerations✅ Good: The fix is more efficient than the original - it uses precise range bounds instead of iterating over all loop events. Security Concerns
Test Coverage❌ Missing: No test changes are included in this PR. Recommendations:
Additional ObservationsLogging: Consider adding structured logging around the forgotten history logic: tracing::debug!(
?from_workflow_id,
?location,
iteration,
"moving loop events to forgotten history"
);This would help diagnose similar issues in the future. SummaryStrengths:
Action Items:
Overall, this is a good fix for a critical bug, but needs test coverage before merging. |
5458e50 to
8431b53
Compare
9a87fd1 to
0560d9b
Compare
8431b53 to
5972ed8
Compare
PR Review: fix(gas): fix loop forgotten bug due to concurrencySummaryThis PR fixes a concurrency bug in the gasoline workflow engine's loop forgotten history handling and includes a few related changes. 🚨 Critical Issue: Broken Build ScriptFile: The shell script has a syntax error that will break the docker build -f "${DOCKERFILE}" --target "${TARGET}" \
--platform linux/x86_64 \
--build-arg BUILD_FRONTEND=true \
# --build-arg CARGO_BUILD_MODE=release \ # ← This breaks the script!
"${BUILD_TAG_ARGS[@]}" \
"${CONTEXT}"In bash, a Recommendation: Either remove the commented line entirely, or use a different approach: docker build -f "${DOCKERFILE}" --target "${TARGET}" \
--platform linux/x86_64 \
--build-arg BUILD_FRONTEND=true \
"${BUILD_TAG_ARGS[@]}" \
"${CONTEXT}"✅ Core Bug Fix: Loop Forgotten History (
|
PR Review: fix(gas): fix loop forgotten bug due to concurrencySummaryThis PR fixes a bug in the gasoline workflow engine's loop event handling where events from the current iteration could be incorrectly moved to forgotten history due to a concurrency issue. 🔴 Critical Issue: Broken Shell ScriptFile: The shell script has a syntax error that will cause the build to fail: docker build -f "${DOCKERFILE}" --target "${TARGET}" \
--platform linux/x86_64 \
--build-arg BUILD_FRONTEND=true \
# --build-arg CARGO_BUILD_MODE=release \ # <-- This breaks the command!
"${BUILD_TAG_ARGS[@]}" \
"${CONTEXT}"In bash, Recommendation: Either remove the commented line entirely or move it outside the command: # Uncomment for release builds: --build-arg CARGO_BUILD_MODE=release
docker build -f "${DOCKERFILE}" --target "${TARGET}" \
--platform linux/x86_64 \
--build-arg BUILD_FRONTEND=true \
"${BUILD_TAG_ARGS[@]}" \
"${CONTEXT}"
|
ff15327 to
f0e981f
Compare
5972ed8 to
8a61739
Compare
Pull Request Review: fix(gas): fix loop forgotten bug due to concurrencySummaryThis PR fixes a critical concurrency bug in the gasoline workflow engine's loop event history management. The main issue was that when moving loop events to forgotten history during iteration, the code was clearing all loop events instead of just those up to the current iteration. This could cause data loss and race conditions when concurrent operations were happening. Code Quality & Best Practices✅ Strengths:
Potential Bugs🔴 Critical Fix - Correctly Addressed:
This ensures that events from the current iteration and beyond are not prematurely deleted. let loop_events_subspace_end = self
.subspace
.subspace(&keys::history::EventHistorySubspaceKey::new(
from_workflow_id,
location.clone(),
iteration.saturating_sub(1), // Line 2595
false,
))
.range()
.1;When Performance Considerations✅ Good:
Suggestion: Security Concerns✅ No Major Issues:
Test Coverage❌ Missing:
Recommendations:
Additional Changesrunner2.rs Changes: // OLD:
if init.is_none() {
init = Some(init_sig);
check_queue = true;
}
// NEW:
init = Some(init_sig);
check_queue = true;Question: Is this change intentional and related to the concurrency fix? This could allow re-initialization which might be desired behavior, but it's not clear from the PR description. Please clarify the reasoning. pegboard-runner/conn.rs: Recommendations
VerdictOverall Assessment: ✅ Approve with minor suggestions The core fix addresses a real concurrency bug correctly. The range-based clearing approach is the right solution. However, the PR would benefit from:
The fix itself is sound and should prevent the data loss issue caused by the previous implementation. |
PR Review: fix(gas): fix loop forgotten bug due to concurrencySummaryThis PR fixes a critical concurrency bug in the workflow engine's loop event management where events from concurrent loop iterations could be incorrectly moved to forgotten history. Code Quality & Best Practices✅ Strengths
|
8a61739 to
a5ec483
Compare
f0e981f to
f7de5c8
Compare
PR Review: fix(gas): fix loop forgotten bug due to concurrencySummaryThis PR addresses a critical bug in the gasoline workflow engine's loop event handling where concurrent loop iterations could cause events from the current iteration to be incorrectly moved to forgotten history. Code Quality & Best Practices✅ Strengths
|
Code ReviewI've reviewed this PR which fixes a loop forgotten bug due to concurrency in the gasoline workflow engine. Here's my analysis: Core Fix: Loop Event Range Selection (engine/packages/gasoline/src/db/kv/mod.rs)Positive:
Issues & Suggestions:
Pegboard Runner Changes (engine/packages/pegboard/src/workflows/runner2.rs)Line 105-108: init = Some(init_sig);
check_queue = true;Issue: The removal of the
Recommendation: Add a check or comment explaining why it's safe to overwrite init, or add error handling if this shouldn't happen. Commented Code IssuesCargo.toml (lines 436-439): # strip = true
# panic = "abort"
# overflow-checks = false
# debug-assertions = falseCritical: These commented-out profile settings should not be committed to the codebase:
scripts/docker/build-push.sh (line 29): # --build-arg CARGO_BUILD_MODE=release \Issue: Commented build arg in the middle of a command. This suggests:
scripts/run/docker/engine-rocksdb.sh (line 18):
Security & Best PracticesNo security concerns identified - The changes are internal database operations and workflow state management. Best practices observations:
Test CoverageRecommendation: This PR fixes a concurrency bug. Consider adding:
SummaryApproval recommendation: The core bug fix is sound and addresses a real concurrency issue. However:
Once the commented code is cleaned up and the init overwrite is clarified, this should be good to merge. |
f7de5c8 to
6eecd1a
Compare
a5ec483 to
9893845
Compare
Code Review - PR #3482: fix(gas): fix loop forgotten bug due to concurrencySummaryThis PR fixes a critical concurrency bug in the loop event history management system where events from all loop iterations were being moved to forgotten history, rather than just events up to the current iteration. Code Quality & Best Practices✅ Strengths
|
Code Review - PR #3482: fix(gas): fix loop forgotten bug due to concurrencySummaryThis PR fixes a critical concurrency bug in the Gasoline workflow engine where loop events were being incorrectly moved to forgotten history. The main issue was that the range query was inadvertently including events from all loop iterations, when it should only include events up to the current iteration. Critical Fix: Loop Forgotten Bug (gasoline/src/db/kv/mod.rs)Issue Fixed: Solution:
This ensures only events up to (but not including) the current iteration are moved to forgotten history. Code Quality: ✅ Excellent
Secondary Fix: Runner Init Signal Handling (pegboard/src/workflows/runner2.rs)Change: // Before:
if init.is_none() {
init = Some(init_sig);
check_queue = true;
}
// After:
init = Some(init_sig);
check_queue = true;Analysis:
Recommendation:
Minor Change: Workflow Name Update (pegboard-runner/src/conn.rs)Change: Analysis: ✅ Straightforward migration to new runner version Commented Out CodeFiles with commented code:
Recommendations:
Test Coverage❓ Question:
Performance Considerations✅ Positive:
Security Considerations✅ No security concerns identified
Summary of Recommendations
VerdictOverall Assessment: ✅ Approve with suggestions The core fix for the loop forgotten bug is solid and addresses a critical concurrency issue. The implementation is clean and well-commented. The secondary changes need minor clarification, and test coverage should be considered. What's Good:
What Needs Attention:
|
6eecd1a to
8c26dca
Compare
9893845 to
7dad582
Compare

No description provided.