Open
Labels
P3: low (Nice to have: polish, cleanup, or long-term), enhancement (New feature or request)
Source
Audit report — Section 9: Phase 3 Roadmap
Description
When a job fails, the only option is manual intervention (inspect, fix, relaunch). There is no automated retry mechanism that can safely rebase the job's work onto the latest base branch and retry.
Proposed Solution
- Retry policy configuration: Per-job or global config for max retries, backoff strategy
- Safe rebase before retry: When retrying, rebase the job's branch onto the latest base/integration branch to pick up any changes from other jobs
- Conflict detection: If the rebase hits conflicts, pause the job and notify instead of blindly retrying
- Retry context: Pass failure context to the retried agent (what failed, error output) so it can adapt
- Retry limits: Configurable max retries with exponential backoff to prevent infinite loops
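To make the proposal concrete, here is a minimal sketch of how the pieces above could fit together. It is not an existing API in this project: `RetryPolicy`, `git_rebase`, `run_with_retry`, the status strings, and all default values are hypothetical names and numbers chosen for illustration. The only external calls are the standard `git rebase` and `git rebase --abort` commands.

```python
import subprocess
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class RetryPolicy:
    """Hypothetical per-job or global retry settings (defaults are placeholders)."""
    max_retries: int = 3
    base_delay_s: float = 30.0
    backoff_factor: float = 2.0
    max_delay_s: float = 600.0

    def delay_for(self, attempt: int) -> float:
        """Exponential backoff delay before retry `attempt` (1-based), capped."""
        delay = self.base_delay_s * self.backoff_factor ** (attempt - 1)
        return min(delay, self.max_delay_s)


def git_rebase(repo: str, base: str) -> bool:
    """Rebase the branch checked out in `repo` onto `base`.

    Returns False if the rebase fails (typically a conflict), after aborting
    it so the working tree is left as it was before the attempt.
    """
    result = subprocess.run(
        ["git", "-C", repo, "rebase", base], capture_output=True, text=True
    )
    if result.returncode != 0:
        subprocess.run(["git", "-C", repo, "rebase", "--abort"], capture_output=True)
        return False
    return True


def run_with_retry(
    run_job: Callable[[Optional[str]], Optional[str]],
    rebase: Callable[[], bool],
    policy: RetryPolicy,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Run a job, retrying on failure according to `policy`.

    `run_job(failure_context)` returns None on success, or the error output
    on failure; that output is passed to the next attempt as retry context.
    Returns "succeeded", "paused_on_conflict", or "exhausted".
    """
    failure = run_job(None)  # initial attempt has no prior failure context
    for attempt in range(1, policy.max_retries + 1):
        if failure is None:
            return "succeeded"
        sleep(policy.delay_for(attempt))
        if not rebase():
            # Conflict: stop here so a human can be notified and resolve it.
            return "paused_on_conflict"
        failure = run_job(failure)
    return "succeeded" if failure is None else "exhausted"
```

A caller would wire in the real rebase step with something like `rebase=lambda: git_rebase(worktree_path, base_branch)`, where both arguments come from the job's configuration. Passing `rebase` and `sleep` as callables keeps the loop testable without a repository or real delays.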
Relationship to Other Issues
- Builds on feat: Capture failure artifacts on job failure (terminal output, diff, env) #23 (failure artifact capture) for retry context
- Related to (but distinct from) closed feat: Autopilot graceful degradation with job retry — pause on error instead of hard-fail #39 (autopilot pause on error): #39 was about graceful degradation, whereas this issue is about automated recovery
- Benefits from bug: mc_sync targets HEAD@{upstream} instead of base branch — can no-op or sync wrong ref #32 (sync against local base) for the rebase step