Feature/kiloclaw admin cli admin view and schema#2376
Feature/kiloclaw admin cli admin view and schema#2376evanjacobson wants to merge 52 commits intomainfrom
Conversation
…acking
- Add initiated_by column to kiloclaw_cli_runs (admin/user) with migration
- Create shared createCliRun util in lib/kiloclaw/cli-runs.ts
- Use createCliRun in kiloclaw-router, organization-kiloclaw-router, admin router
- Update cancelKiloCliRun to mark admin runs as cancelled in DB
- Add initiatedBy filter to listAllCliRuns query and UI
- Show Admin badge in CLI runs list for admin-initiated runs
- Remove unused audit log and scrubPrompt from admin router
# Conflicts:
# packages/db/src/migrations/meta/0074_snapshot.json
# packages/db/src/migrations/meta/_journal.json
Return the created run id from admin-started CLI runs and use it to scope admin status and cancel flows to a single persisted row. Also extract shared CLI cancellation persistence so personal, org, and admin paths update terminal state consistently.
Code Review SummaryStatus: 2 Issues Found | Recommendation: Address before merge Overview
Issue Details (click to expand)WARNING
Other Observations (not in diff)None. Files Reviewed (4 files)
Fix these issues in Kilo Cloud Reviewed by gpt-5.4-20260305 · 465,695 tokens |
… related components
…e error messages for consistency
| @@ -0,0 +1,2 @@ | |||
| ALTER TABLE "kiloclaw_cli_runs" ADD COLUMN "initiated_by_admin_id" text;--> statement-breakpoint | |||
| ALTER TABLE "kiloclaw_cli_runs" ADD CONSTRAINT "kiloclaw_cli_runs_initiated_by_admin_id_kilocode_users_id_fk" FOREIGN KEY ("initiated_by_admin_id") REFERENCES "public"."kilocode_users"("id") ON DELETE set null ON UPDATE no action; No newline at end of file | |||
There was a problem hiding this comment.
WARNING: Validating this foreign key can block writes
Adding a foreign key on an existing populated table validates every existing row while holding a lock that conflicts with inserts and updates on kiloclaw_cli_runs. Consider adding the constraint as NOT VALID and validating it in a separate step so deploys do not stall CLI run writes.
There was a problem hiding this comment.
[bot] verify/fix in schema.ts and I will regenerate.
evanjacobson
left a comment
There was a problem hiding this comment.
Submitting pending review to unblock replies.
| const { data } = await kiloCliRunHistory.refetch(); | ||
| const latestRun = selectCurrentCliRun(data?.runs, status.instanceId); | ||
| if (latestRun?.status === 'running') { | ||
| router.push(`/claw/kilo-cli-run/${latestRun.id}`); |
There was a problem hiding this comment.
If there's a run in progress, clicking the Recover with Kilo button will directly route into the run view.
This UX will be improved in a follow up soon after this PR, so this is the MVP as far as providing a decent UX with the enforcement of "one run at a time"
| /> | ||
| <div className="flex items-center gap-3"> | ||
| <Button variant="ghost" size="sm" onClick={() => router.push('/claw')}> | ||
| <Button variant="ghost" size="sm" onClick={() => router.push('/claw/settings')}> |
There was a problem hiding this comment.
Routes the user back to the settings page instead of the chat page
| onOpenChange(false); | ||
| router.push(`/claw/kilo-cli-run/${data.id}`); | ||
| }, | ||
| onError: err => { |
There was a problem hiding this comment.
[bot] we can probably streamline this.
| ...trpc.admin.kiloclawInstances.getKiloCliRunStatus.queryOptions( | ||
| selectedRunId | ||
| ? { userId, instanceId, runId: selectedRunId } | ||
| : { userId, instanceId, runId: '00000000-0000-0000-0000-000000000000' } |
There was a problem hiding this comment.
[bot] why are we manually setting runId here?
| <DetailField label="Status"> | ||
| <Badge | ||
| className={ | ||
| runStatus.status === 'running' |
There was a problem hiding this comment.
4x nested ternary is the most I've personally witnessed.
| .notNull() | ||
| .references(() => kilocode_users.id, { onDelete: 'cascade' }), | ||
| instance_id: uuid().references(() => kiloclaw_instances.id), | ||
| initiated_by_admin_id: text().references(() => kilocode_users.id, { onDelete: 'set null' }), |
There was a problem hiding this comment.
added to kiloclaw_cli_runs
| const parsed = StartRunBodySchema.safeParse(body); | ||
| if (!parsed.success) { | ||
| return c.json( | ||
| { error: 'Invalid request body', details: z.treeifyError(parsed.error) }, |
There was a problem hiding this comment.
error.flatten() is deprecated in favor of z.treeifyError(error)
| const status = (err as { status: number }).status; | ||
| if (status >= 400 && status < 600) return status; | ||
| // Extract a valid HTTP status from the error or its cause, defaulting to 500. | ||
| for (const candidate of [err, err instanceof Error ? err.cause : undefined]) { |
There was a problem hiding this comment.
this is just a refactor for the sake of clarity
| typeof (err as { code?: unknown }).code === 'string' | ||
| ? (err as { code: string }).code | ||
| : undefined; | ||
| : typeof err === 'object' && |
There was a problem hiding this comment.
[bot] can we leverage a helper method here? Or at least explain to me what's happening
- Revert unnecessary InstanceControls componentization - Add undefined guard on createCliRun row insert - Document baseProcedure choice on listKiloCliRuns - Add comment explaining placeholder UUID in KiloCliRunCard - Extract getErrorCode helper in platform.ts to replace deep as-cast chain
| </div> | ||
| {run.initiated_by_admin_id && ( | ||
| <p className="text-muted-foreground text-xs"> | ||
| Initiated by {run.initiated_by_admin_email ?? 'Admin'} |
There was a problem hiding this comment.
This is useful info at a glance for us internally. Will not be available for users.
…rmatting and detail field
…ancellation and status retrieval
| await markCliRunCancelled({ | ||
| runId: params.runId, | ||
| userId: params.userId, | ||
| instanceId: params.instanceId, |
There was a problem hiding this comment.
WARNING: Successful cancel will not update historical instance-scoped rows
When the caller has no active personal instance, cancelCliRun() can now find a destroyed-instance run via the relaxed lookup, but params.instanceId is still null. markCliRunCancelled() treats null as instance_id IS NULL, so this update misses the matched row and leaves it marked running in Postgres even after the worker confirms cancellation.
| await persistCliRunControllerStatus({ | ||
| runId: params.runId, | ||
| userId: params.userId, | ||
| instanceId: params.instanceId, |
There was a problem hiding this comment.
WARNING: Terminal status still won't persist for historical runs
The same params.instanceId value is reused here after loading a destroyed-instance row. In the no-active-instance fallback that value remains null, so persistCliRunControllerStatus() updates only instance_id IS NULL rows and skips the matched historical run. That leaves the DB row stuck at running after the controller reports completion or failure.
Summary
Recover with Kilo was missing the basic safeguards and visibility needed to be a reliable support workflow: admins could not cleanly initiate recovery runs for users, concurrent starts could race, and users had no dependable way to get back to an in-progress run instead of accidentally starting another. This phase 1 tightens that loop so recovery runs behave like a single tracked operation per instance and are visible from the places operators and users already expect.
This change adds admin-initiated Kilo CLI recovery runs, persists run metadata so status can be correlated across app and controller state, and surfaces in-progress runs from the user flow via the existing button path. It follows the existing KiloClaw patterns of instance-scoped routing, DB-backed operational metadata, admin audit logging, and controller/platform conflict mapping rather than introducing a new recovery-specific workflow shape.
Verification
Visual Changes
Reviewer Notes
Known Limitations