Captain is a Go CLI for working with Claude Code sessions, hooks, and sandboxes.
It provides tools to:
- inspect Claude Code project/session history
- summarize tool usage, paths, binaries, and API cost
- install Claude Code hooks
- enforce a session-specific Definition of Done gate
- test and use AI providers from the command line
- generate/build/run containerized Claude Code sandboxes
- inspect and clean stored Claude project session data
From the codebase, Captain is organized around a few core capabilities.

Captain reads Claude session history and exposes commands for:

- `history` — inspect tool usage from Claude sessions
- `info` — show project/session metadata for the current directory
- `cost` — estimate and group token/API usage across sessions
- `projects list` — list tracked Claude projects and sessions
- `projects clean` — remove old session history
It analyzes:
- tools used
- files/directories read and written
- domains accessed
- binaries executed
- approval / denial state
- categories of bash activity
- token and cost summaries
Captain can install hook commands into Claude settings for:

- PreToolUse bash scanning via `hook bash-check`
- Stop hook gating via `hook dod install`
The bash-check hook scans bash commands and can deny unsafe or disallowed commands.
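As an illustration, a PreToolUse entry of roughly this shape would route Bash tool calls through the scanner. This assumes Claude Code's standard `settings.json` hooks format; the exact command string Captain writes may differ:

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          { "type": "command", "command": "captain hook bash-check" }
        ]
      }
    ]
  }
}
```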
Captain supports a per-session Definition of Done workflow:
- `dod set` — attach one or more validation commands to a Claude session
- `dod check` — intended for Claude Stop hooks
- `dod run` — manually execute DoD checks
- `dod status` — show current DoD config/results
- `dod clear` — remove the DoD gate
This lets Claude continue iterating until required checks pass.
Captain includes provider-agnostic AI utilities under `captain ai`:

- `ai prompt` — send a prompt to a selected backend/model
- `ai models` — list model information
- `ai test` — verify provider connectivity
- `ai fixture` — run a YAML benchmark fixture across multiple Claude configurations and capture a markdown evidence report
Supported backends are inferred from code and dependencies, including:
- Anthropic
- Gemini / Google
- OpenAI-compatible paths via the internal provider layer
- CLI/provider abstractions for local tool-backed backends
Captain can discover Claude-related local configuration and package it into a container sandbox.
Supported workflows include:
- `container` — interactive TUI
- `container list` — list discovered components
- `container generate` — generate Dockerfile/build context and sandbox config
- `container build` — build the sandbox image
- `container run` — run the generated sandbox
- `sandbox presets` — list available sandbox-runtime presets
The container workflow is designed to package things like:
- Claude config
- agents
- commands
- hooks
- MCP server config
- project settings
- token/env passthrough
- sandbox-runtime presets for languages/tools
```
captain/
├── cmd/captain/      # CLI entrypoint
├── pkg/ai/           # AI abstraction, provider config, models
├── pkg/ai/fixture/   # YAML fixture runner for Claude configuration benchmarks
├── pkg/bash/         # Bash scanning, classification, rules
├── pkg/claude/       # Claude history, sessions, parsing, formatting
├── pkg/cli/          # Cobra/clicky command implementations
├── pkg/container/    # Sandbox discovery, generation, build/run logic
├── pkg/dod/          # Definition of Done persistence and execution
├── pkg/sandbox/      # Token/preset/sandbox helpers
├── Dockerfile        # Container image for captain/Claude tooling
├── entrypoint.sh     # gosu-based user switching entrypoint
├── Makefile          # Thin wrapper around Taskfile
└── Taskfile.yaml     # Main developer tasks
```
Top-level commands exposed by `cmd/captain/main.go`:

```
captain history
captain info
captain cost
captain sandbox
captain ai
captain dod
captain hook
captain projects
captain container
```

```
captain history
captain history --summary
captain history --all
captain history --tool Bash --since now-7d
captain history --category git --compact
```

Useful flags include:

`--file`, `--tool`, `--dir`, `--category`, `--approved`, `--limit`, `--since`, `--all`, `--short`, `--compact`, `--summary`
```
captain info
captain info --path /path/to/project
```

Shows project root detection, Claude project directory, session counts, history range, and tool call totals.
```
captain cost
captain cost --group-by project
captain cost --group-by model
captain cost --group-by tool
captain cost --group-by category
captain cost --all --since now-30d
```

Supported groupings from the code:

`session`, `project`, `model`, `day`, `dir`, `file`, `tool`, `category`
Install the bash safety hook:

```
captain hook bash-check install
captain hook bash-check install --user
```

Install the DoD stop hook and related skill files:

```
captain hook dod install
captain hook dod install --user
```

```
captain dod set --session-id <session-id> "go test ./..." "golangci-lint run"
captain dod status --session-id <session-id>
captain dod run --session-id <session-id>
captain dod clear --session-id <session-id>
```

```
captain ai prompt --model claude-sonnet-4 --prompt "Summarize this diff"
captain ai test --model gemini-2.0-flash
captain ai models
captain ai fixture --file examples/ai-fixtures/mission-control-investigate.yaml
```

Relevant provider flags include:

`--model`, `--backend`, `--api-key`, `--no-cache`, `--budget`, `--debug`
captain ai fixture runs the same prompt against multiple Claude
configurations (different models, tool allowlists, MCP servers, prompt
caching on/off) and prints a side-by-side table of duration, cost,
tokens, and tool-call counts. It's intended to produce evidence that
one approach — e.g. a structured MCP — is faster and cheaper than a
Bash/CLI equivalent.
```
captain ai fixture -f examples/ai-fixtures/mission-control-investigate.yaml
captain ai fixture -f examples/ai-fixtures/mission-control-describe.yaml --report /tmp/mc-describe.md
captain ai fixture -f examples/ai-fixtures/mission-control-multistep.yaml --repeat 5
```

Flags:

- `--file`/`-f` — path to the YAML fixture (required)
- `--report`/`-r` — write an evidence report (headline, metrics table, per-run config, tool-usage breakdown) to this path
- `--format` — report format: `markdown` (default), `html`, or `ansi`
- `--artifacts` — directory for per-run `stream-json` captures (default: `<fixture-dir>/.captain/fixtures/<name>/`)
- `--repeat` — override every run's repeat count (useful for smoke tests: `--repeat 1`)
YAML schema (abridged):

```yaml
name: my-benchmark
description: What you're measuring and why
prompt: |
  The prompt sent to every run (can be overridden per-run).
baseline: direct   # which run to compare against for Speedup/Cheaper ratios
repeat: 3          # default N per run; overridable per-run and via --repeat
defaults:
  timeout: 3m
  permissionMode: bypassPermissions
  promptCaching: true
  model: claude-sonnet-4
runs:
  - name: direct
    tools: [Bash]
    allowedTools: ["Bash(kubectl *)", "Bash(aws *)"]
  - name: mission-control
    tools: [default]
    mcpConfig: [.mcp.json]
    allowedTools: ["mcp__mission-control__*"]
    repeat: 5    # overrides fixture-level repeat for this run
```

Two rules the runner enforces for you so direct-vs-MCP comparisons stay honest — both are automatic, no extra flags needed:
- MCP is opt-in per run. A run gets MCP servers only when `mcpConfig` is set. With no `mcpConfig`, the runner passes `--strict-mcp-config` with an empty inline config, so ambient `.mcp.json` in the fixture directory and user-level MCP servers are never picked up.
- `allowedTools` is treated as a real allowlist. Claude CLI's `--allowedTools` is natively an auto-approve list, not a restriction — under `bypassPermissions` the model can still reach for anything. When a run specifies `allowedTools`, the runner demotes the effective `--permission-mode` from `bypassPermissions` to `default` so unlisted tools are denied in non-interactive mode. If you set `permissionMode` to anything other than `bypassPermissions` explicitly, your choice is preserved. Runs without `allowedTools` keep whatever permission mode they asked for.
Practical consequence for a direct-vs-MCP fixture: the direct run with `allowedTools: [Bash(kubectl *), ...]` can only shell out to those patterns; the mission-control run with `allowedTools: [mcp__mission-control__*]` can only use MCP — Bash is off even though it's a built-in. Neither run can accidentally borrow from the other's toolset.
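The permission-mode demotion can be sketched as a small rule. This is an illustrative helper, not Captain's actual implementation:

```go
package main

import "fmt"

// effectivePermissionMode sketches the rule described above (hypothetical
// helper): when a run lists allowedTools while asking for bypassPermissions,
// drop to "default" so unlisted tools are denied in non-interactive mode.
// Any explicitly chosen non-bypass mode is preserved as-is.
func effectivePermissionMode(mode string, allowedTools []string) string {
	if len(allowedTools) > 0 && mode == "bypassPermissions" {
		return "default"
	}
	return mode
}

func main() {
	fmt.Println(effectivePermissionMode("bypassPermissions", []string{"Bash(kubectl *)"})) // default
	fmt.Println(effectivePermissionMode("bypassPermissions", nil))                         // bypassPermissions
	fmt.Println(effectivePermissionMode("acceptEdits", []string{"mcp__x__*"}))             // acceptEdits
}
```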
Supported per-run fields: `name`, `prompt`, `system`, `model`, `timeout`, `cwd`, `permissionMode`, `appendSystemPrompt`, `settings`, `maxBudgetUSD`, `repeat`, `tools`, `allowedTools`, `disallowedTools`, `mcpConfig`, `addDir`, `betas`, `extraArgs`, `env`, `promptCaching`, `noSessionPersistence`, `bare`. See `examples/ai-fixtures/` for working benchmarks.
Repeats (`repeat: N`) execute each run N times and report the mean duration/cost plus a sample standard deviation — single-shot LLM numbers are noisy and N≥3 makes comparisons defensible. The raw per-iteration `stream-json` is saved under the artifacts directory so every claim in the report is reproducible.
Set `captureKubernetesProxy: true` at the fixture level to route every kubectl call made during the fixture through a captain-managed reverse proxy, and record both layers of activity:

```yaml
captureKubernetesProxy: true
kubeconfig: ~/.kube/config   # optional; defaults to client-go discovery
```

When enabled, the runner:

- starts a localhost reverse proxy that loads the user's kubeconfig (auth plugins included) and forwards to the real cluster
- generates a temp kubeconfig pointing at the proxy and injects `KUBECONFIG=<that path>` into every run's environment, so kubectl can't bypass it
- writes a JSONL log per run/iteration to `<artifacts>/<run>-<iter>.kubectl.jsonl` with two record types:
  - `{"type":"command","command":"kubectl get pods -n prod"}` — literal CLI invocation parsed from the model's Bash tool calls
  - `{"type":"request","method":"GET","path":"/api/v1/...","status":200}` — every API call observed by the proxy
- surfaces a Kubectl activity section in the report with per-run CLI and API counts plus a few sample commands
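The two record types make the logs easy to post-process. A sketch that tallies CLI invocations versus proxied API requests from one log's contents — field names follow the sample records above; the helper itself is hypothetical:

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// record mirrors the two JSONL shapes shown above.
type record struct {
	Type    string `json:"type"`
	Command string `json:"command,omitempty"`
	Method  string `json:"method,omitempty"`
	Path    string `json:"path,omitempty"`
	Status  int    `json:"status,omitempty"`
}

// countActivity tallies "command" (CLI) vs "request" (API) records from the
// contents of a <run>-<iter>.kubectl.jsonl file.
func countActivity(jsonl string) (commands, requests int) {
	for _, line := range strings.Split(strings.TrimSpace(jsonl), "\n") {
		var r record
		if json.Unmarshal([]byte(line), &r) != nil {
			continue // skip malformed lines
		}
		switch r.Type {
		case "command":
			commands++
		case "request":
			requests++
		}
	}
	return
}

func main() {
	log := `{"type":"command","command":"kubectl get pods -n prod"}
{"type":"request","method":"GET","path":"/api/v1/namespaces/prod/pods","status":200}
{"type":"request","method":"GET","path":"/api/v1/namespaces/prod/events","status":200}`
	c, r := countActivity(log)
	fmt.Printf("cli=%d api=%d\n", c, r) // cli=1 api=2
}
```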
```
captain container
captain container list
captain container generate
captain container generate -i
captain container build --preset golang
captain container run
captain sandbox presets
```

Important generate/build flags:

`--interactive`, `--preset`, `--base`, `--mode copy|mount`
This repo uses `task` as the main task runner.

```
task build
# or
make build
```

Binary output: `.bin/captain`

```
task test
# or
make test
```

```
task lint
```

```
task install
```

By default this copies the built binary to `/usr/local/bin/captain`.
The included Dockerfile builds on flanksource/base-image and installs:

- Node.js
- git, gh, jq, vim, nano, zsh, fzf, etc.
- Claude Code via `@anthropic-ai/claude-code`

The image is set up to:

- create a user matching host UID/GID
- switch execution using `gosu`
- use `/workspace` as the working directory
Primary stack:
- Go 1.25.8
- Cobra for CLI wiring
- clicky for formatting/output/flag binding
- sandbox-runtime for sandbox preset handling
- AI SDKs for Anthropic, OpenAI, and Gemini/Google
- shell parsing via `mvdan.cc/sh/v3`
This README reflects the current code layout.
At the time of generation, `go test ./...` in this checkout does not fully pass. Observed issues included:

- a build failure around `pkg/claude/tools`
- failing container tests
- AI integration tests requiring valid external provider credits/config
So treat this repository as active/in-progress rather than guaranteed green in the current local state.
```
cd captain
make build
.bin/captain info
.bin/captain history --summary
.bin/captain container list
```

If you want to use hooks:

```
.bin/captain hook bash-check install --user
.bin/captain hook dod install --user
```

- Captain is tightly focused on Claude Code workflows.
- It is both an analysis tool and an execution/control tool.
- The container/sandbox functionality is a major part of the project, not a side feature.
- Many commands assume the presence of Claude local state under the user’s Claude config/projects directories.