Conversation
#2295) * feat(merges): add dismiss actions for failed beads on Merge Queue page - Add individual Dismiss (X) button to each failed bead row in AttentionItemRow - Add bulk 'Dismiss all failed (N)' button to NeedsAttention header area - Fix view button fallback: open MR bead when sourceBead is null (orphaned beads) - Both individual and bulk dismiss call updateBead with status: 'closed' on the MR bead - Dismiss all shows loading spinner and toast on completion/error * fix(merges): use rigId directly in openDrawer to fix TS typecheck --------- Co-authored-by: John Fawcett <john@kilcoode.ai>
* feat(gastown): add town ID copy badge and Debug settings section * fix(gastown): sanitize debug payload — strip git_url credentials and replace git_author_name with presence flag --------- Co-authored-by: John Fawcett <john@kilcoode.ai>
…2318) * chore(gastown): remove dead popReviewQueue and update stale comments Remove popReviewQueue() from review-queue.ts and its wrapper from Town.do.ts — confirmed no callers in the tRPC router, reconciler, or anywhere else. Also remove the Town.do.ts completeReview() wrapper (also had no external callers) and update stale comments across review-queue.ts, Town.do.ts, reconciler.ts, beads.ts, and container-dispatch.ts that referenced old patrol/scheduling functions (feedStrandedConvoys, rehookOrphanedBeads, schedulePendingWork, recoverStuckReviews, witnessPatrol, processReviewQueue) to reflect the current reconciler-based architecture. Closes #1403 * ci: retrigger Kilo Code Review (previous run failed due to transient clone error) * test(gastown): update integration tests to remove removed popReviewQueue/completeReview APIs popReviewQueue() and completeReview() were removed from TownDO as dead code. Update integration tests to use listBeads({ type: 'merge_request' }) instead of popReviewQueue() to observe MR bead state, and remove the regression guard test for completeReview which is no longer testable via the TownDO public API. --------- Co-authored-by: John Fawcett <john@kilcoode.ai>
…em prompt (#2321) * fix(gastown): prevent triage batch bead dispatch loop with wrong system prompt Option A: Mark triage batch bead as in_progress immediately after hookBead() in maybeDispatchTriageAgent(), before awaiting startAgentInContainer(). This prevents reconciler Rule 2 (idle agent + open hooked bead → dispatch_agent) from re-dispatching the triage bead with the generic polecat prompt on the next tick when container start fails. Rule 3 (stale in_progress, 5-min timeout) resets it to open for a clean retry via maybeDispatchTriageAgent. Option B (defense-in-depth): In applyActionCtx.dispatchAgent, detect triage batch beads (gt:triage label + created_by='patrol') and inject the triage system prompt, ensuring the polecat gets the correct tools even if Rule 2 somehow fires against an open triage batch bead. Fixes #1958 * fix(gastown): set rig_id on triage batch bead so reconciler Rule 1 can re-dispatch after timeout Without rig_id, when Rule 3 resets an abandoned in_progress triage batch bead to 'open', Rule 1 skips it (requires rig_id IS NOT NULL). This left the bead permanently 'open', blocking maybeDispatchTriageAgent from creating a replacement. Setting rig_id ensures Rule 1 can re-dispatch the existing bead (with triage system prompt injected via Option B). --------- Co-authored-by: John Fawcett <john@kilcoode.ai>
Add remaining build-essentials packages (cmake, pkg-config) to both prod and dev Dockerfiles. build-essential and libssl-dev were already present.
Install default-jdk (OpenJDK) in both prod and dev Dockerfiles to support Java project builds and runtime.
…ings save (#2366) * fix(gastown): propagate custom env_vars to running containers on settings save Three gaps fixed: 1. syncTownConfigToProcessEnv() now applies custom env_vars from town config to process.env, with tracking of previously-applied keys so removed vars are deleted from process.env. 2. syncConfigToContainer() now persists custom env_vars to TownContainerDO storage (via container.setEnvVar/deleteEnvVar) so they survive container restarts. Previously-persisted custom keys are tracked in DO storage and cleaned up on removal. 3. updateAgentModel() hot-swap now overlays fresh custom env_vars from getCurrentTownConfig() over the stale startupEnv snapshot. Infra keys in LIVE_ENV_KEYS always take precedence. * fix(gastown): guard custom env_vars against reserved key override - control-server: export getLastAppliedEnvVarKeys() for process-manager - process-manager: delete stale custom keys from hotSwapEnv on hot-swap - Town.do: skip RESERVED_ENV_KEYS when setting custom env_vars on container Addresses 3 review warnings about custom env_vars overriding infra keys. * fix: skip reserved env keys in prevCustomKeys cleanup loop prevCustomKeys may contain reserved keys persisted by the previous implementation (before the RESERVED_ENV_KEYS filter was added). Without this guard the cleanup loop would delete managed infra values like KILOCODE_TOKEN that were just written by envMapping. --------- Co-authored-by: John Fawcett <john@kilcoode.ai>
…1403) (#2339) chore(gastown): remove dead GUPP_WARN_MS export and update stale patrol/queue comments - Remove unused GUPP_WARN_MS constant export from patrol.ts (never referenced outside the file) - Update completion-reporter.ts JSDoc: replace stale witnessPatrol/schedulePendingWork references with reconciler-based description - Update control-server.ts comments: replace stale processReviewQueue/recoverStuckReviews references with current TownDO terminology Part of issue #1403 dead code cleanup. Co-authored-by: John Fawcett <john@kilcoode.ai>
* fix(gastown): break create_landing_mr infinite loop (#2260) Add circuit breaker for landing MR creation to prevent runaway retry loops when convoys have no PR URLs. A town accumulated 5,335 failed actions over 41 hours before this fix. - Fix 1: Deduplicate MR bead creation — skip if an open/in_progress landing MR already exists for the convoy - Fix 2: Max 5 landing MR attempts with exponential cooldown (30s base, 30min cap), fail the convoy when exhausted - Fix 3: PR URL validation guard — skip landing MR creation when no tracked beads have a pr_url - Fix 4: Move convoy fail check before update_convoy_progress to prevent the race where progress updates are emitted for convoys about to be failed/closed Store landing_mr_attempts and last_landing_mr_attempt_at in the convoy bead's metadata JSON field (no schema migration needed). Add FailConvoy action type for explicit convoy failure. * fix(gastown): move max-attempts check after landing MR status lookup The max landing MR attempts guard was firing before checking whether the final landing MR was still active or already merged, making the last allowed attempt impossible to succeed. Now we check landing MR status first and only fail the convoy when no landing MR is active or merged. --------- Co-authored-by: John Fawcett <john@kilcoode.ai>
| // If a landing MR is active (open or in_progress), wait for it | ||
| const hasActiveLanding = landingMrs.some( | ||
| mr => mr.status === 'open' || mr.status === 'in_progress' | ||
| if ((convoyBeadsWithPr[0]?.cnt ?? 0) === 0) { |
There was a problem hiding this comment.
CRITICAL: Final landing is blocked for direct-merge convoys
This guard requires at least one tracked bead MR to have a pr_url, but review-then-land intermediate steps in merge_strategy = 'direct' are merged straight into the convoy feature branch and never persist a PR URL. Once those beads finish, the convoy stays open forever because we keep skipping create_landing_mr.
| } | ||
| // Apply current custom env vars, skipping reserved keys. | ||
| for (const [key, value] of Object.entries(customEnvVars)) { | ||
| if (RESERVED_ENV_KEYS.has(key)) continue; |
There was a problem hiding this comment.
WARNING: Custom env vars can overwrite runtime routing vars
/sync-config now copies arbitrary env_vars into process.env, but RESERVED_ENV_KEYS does not include runtime-owned keys like GASTOWN_TOWN_ID and GASTOWN_RIG_ID. A town-level env var with one of those names will clobber the values that the pending-nudge routes and plugin client read from process.env, which breaks callbacks for every running agent in the container.
| const freshCustomKeySet = new Set<string>(); | ||
| if (freshEnvVars !== null && typeof freshEnvVars === 'object' && !Array.isArray(freshEnvVars)) { | ||
| for (const [key, value] of Object.entries(freshEnvVars as Record<string, unknown>)) { | ||
| if (LIVE_ENV_KEYS.has(key)) continue; |
There was a problem hiding this comment.
WARNING: Hot-swap can still override reserved infra env vars
This overlay only skips LIVE_ENV_KEYS, not the full reserved set that buildAgentEnv and /sync-config protect. A custom env var like GASTOWN_API_URL or GASTOWN_SESSION_TOKEN will be copied into hotSwapEnv here during model updates and can restart the SDK server with broken callback credentials or URLs.
Code Review SummaryStatus: 3 Issues Found | Recommendation: Address before merge Overview
Issue Details (click to expand)CRITICAL
WARNING
Other Observations (not in diff)Issues found in unchanged code that cannot receive inline comments:
Files Reviewed (18 files)
Fix these issues in Kilo Cloud Reviewed by gpt-5.4-20260305 · 3,866,793 tokens |
Summary
Batch merge of 9 PRs from
gastown-stagingintomain, covering dispatch reliability, container env propagation, container image tooling, dead code removal, and UI improvements.Bug Fixes
create_landing_mrinfinite loop (#2371) — Fixed the reconciler retrying landing MR creation every 5s without a circuit breaker (one town accumulated 5,335 actions over 41 hours). Added deduplication, max attempts with exponential cooldown (capped at 30min), PR URL validation guard, and a race condition fix. NewFailConvoyaction type for explicit convoy failure with reason tracking.env_varsto running containers on settings save (#2366) — Custom environment variables set in town settings were not reaching already-running container processes until restart + re-dispatch. Three gaps fixed:syncTownConfigToProcessEnv()now writes custom vars toprocess.env(with cleanup tracking),syncConfigToContainer()persists custom vars to DO storage, andupdateAgentModel()hot-swap overlays fresh custom vars over stale startup snapshots.in_progressbeforestartAgentInContainer()to prevent Rule 2 re-dispatch with generic polecat prompt, and (B) defense-in-depth detection indispatchAgentto inject the correct triage system prompt even if Rule 2 fires.Features
default-jdk(OpenJDK) in both prod and dev Dockerfiles to support Java project builds and runtime.cmakeandpkg-configin both prod and dev Dockerfiles for native dependency builds.Cleanup
GUPP_WARN_MSexport, updated stale comments referencing removed functions (witnessPatrol,schedulePendingWork,processReviewQueue,recoverStuckReviews) to describe the current reconciler/alarm-based architecture.popReviewQueueand update stale comments (#2318) — Deleted deadpopReviewQueue()and itsTown.do.tswrapper, removed deadcompleteReview()wrapper, updated stale comments across review-queue, reconciler, beads, and container-dispatch modules.Verification
gastown-stagingVisual Changes
Reviewer Notes
gastown-stagingto batch test interactions before landing onmaincreate_landing_mrloop fix (fix(gastown): break create_landing_mr infinite loop (#2260) #2371) stores retry metadata (landing_mr_attempts,last_landing_mr_attempt_at) in the convoy bead's existingmetadataJSON column — no schema migration neededSettracking in control-server.ts (resets on container restart, which is fine sinceprocess.envalso resets) and DO durable storage for cross-restart persistenceGIT_TOKEN,KILOCODE_TOKEN, etc.) are silently skipped to maintain infra precedence_setfields only)