Target workflow: Daily Agentic Workflow AIC Usage Audit (agentic-token-audit.md)
Selected because: Highest-frequency AI workflow (daily runs); all other AI workflows were optimized within the past 8 days. Fresh evidence from 10 runs reveals a persistent chart-generation gap and prompt structure improvements.
📊 Spend Profile (June 5–12, 2026 · 10 runs analyzed)
| Metric | Value |
| --- | --- |
| Runs analyzed | 10 |
| Conclusions | 10/10 success (100%) |
| Total AIC | N/A (pre-aggregated logs return 0; upstream data gap) |
| Avg agent job duration | 185 s/run (min 132 s, max 241 s) |
| Avg action minutes/run | ~3.6 min |
| Chart uploads (upload_assets) | 2/10 runs (20% success; 80% skipped) |
| Rolling-summary entries | 0 (always empty; trend chart never generated) |
| Cache efficiency | Not measurable from available data |
Caveat: The gh aw logs export has returned zero AIC values across all 7-day snapshots; AIC spend is proxied by agent job duration. Relative comparisons between recommendations are reliable; absolute AIC numbers are not.
🏆 Ranked Recommendations
1 · Fix chart-generation reliability — upload_assets skipped 8/10 runs
Estimated impact: Qualitative (all 8 affected issues lack charts) · ~0 AIC waste but significant issue-quality loss
Evidence: Across all 10 audited runs, upload_assets succeeded in exactly 2 runs (§27136881236 on June 8, §27017599358 on June 5) and was skipped in all 8 subsequent runs (June 9–12).
Root cause analysis:
- rolling-summary.json has 0 entries, so the trend chart is always skipped per the existing instruction "If there are fewer than 2 rolling-summary points, skip the trend chart."
- When all aic values are 0 (as in the current data-sparse window), the agent also skips the bar chart; no explicit instruction covers this case.
Action: Add two clarifying instructions to Phase 3:
- "Always generate the bar chart (ai_credits_by_workflow.png) even when all values are 0. Label the x-axis 'No billable AIC in this window' and render zero-height bars. Do not skip the chart."
- "If rolling-summary.json has 0 entries, skip the trend chart and write a one-sentence note in the issue explaining this (e.g. 'Trend chart omitted: no historical baseline yet.')."
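The two instructions amount to a small decision rule, which might be sketched as follows. This is only an illustration of the intended behavior, not code from the workflow: the function name `plan_charts`, the shape of its return value, and the default x-axis label "Workflow" are all assumptions.

```python
from typing import Dict, List


def plan_charts(aic_by_workflow: Dict[str, float], rolling_points: List[dict]) -> dict:
    """Decide which charts to render and which notes to add to the issue."""
    all_zero = all(value == 0 for value in aic_by_workflow.values())
    plan = {
        # The bar chart is always rendered, even when every bar has zero height.
        "bar_chart": True,
        # "Workflow" is a hypothetical default label; the zero-data label is from the prompt.
        "bar_chart_xlabel": "No billable AIC in this window" if all_zero else "Workflow",
        # Existing rule: the trend chart needs at least 2 rolling-summary points.
        "trend_chart": len(rolling_points) >= 2,
        "notes": [],
    }
    if not rolling_points:
        # New rule: explain the omission when there is no history at all.
        plan["notes"].append("Trend chart omitted: no historical baseline yet.")
    return plan
```

With the current data (all values 0, empty rolling summary) this plan still produces a bar chart, so upload_assets would have something to upload on every run.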
2 · Merge Phase 1 and Phase 2 into a single Python script
Estimated AIC savings: ~0.05–0.08 AIC/run (1–2 tool calls saved per run)
Evidence: The prompt currently instructs the agent to:
- Write a Python script to /tmp/gh-aw/token-audit/process_audit.py (1 tool call: write file)
- Run the script (1 tool call: bash)
- Read the output file again in Phase 2 to copy it to repo-memory (1 tool call: read + write)
Phases 1 and 2 operate on the same data file. Merging them eliminates one round-trip.
Action: Replace the end of the Phase 1 script spec and all of Phase 2 with a single combined step:
## Phase 1+2 — Process Logs and Persist Snapshot
Write and run a single Python script that:
1. Loads `/tmp/gh-aw/token-audit/workflow-logs.json`, extracts `.runs`, filters to `status == "completed"`.
2. Aggregates per workflow: `run_count`, `total_ai_credits`, `avg_ai_credits`, `total_tokens`, `avg_tokens`, `total_turns`, `avg_turns`, `total_action_minutes`, `error_count`, `warning_count`. Treat null `aic`/`token_usage` as 0.
3. Saves the snapshot to BOTH `/tmp/gh-aw/token-audit/audit_snapshot.json` AND `/tmp/gh-aw/repo-memory/default/YYYY-MM-DD.json` in one write.
4. Loads `/tmp/gh-aw/repo-memory/default/rolling-summary.json`, appends today's overall totals, trims to 90 entries, and saves. Skip the append when `.runs` is empty or no completed runs exist.
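For orientation, steps 1–3 of the combined step could look roughly like the sketch below. It is not the script the agent actually writes; the helper names `aggregate_runs` and `write_snapshot` are hypothetical, the "unknown" fallback for a missing `workflow_name` is an assumption, and the field names are taken from the prompt text quoted in this report.

```python
import json
from collections import defaultdict
from pathlib import Path


def aggregate_runs(runs):
    """Group completed runs by workflow and compute totals and averages."""
    groups = defaultdict(list)
    for run in runs:
        if run.get("status") == "completed":
            # "unknown" is an assumed fallback for runs without a workflow name.
            groups[run.get("workflow_name", "unknown")].append(run)

    def total(items, field):
        # Null or missing values count as 0, as the prompt requires.
        return sum(item.get(field) or 0 for item in items)

    rows = []
    for name, items in groups.items():
        count = len(items)
        credits = total(items, "aic")
        tokens = total(items, "token_usage")
        turns = total(items, "turns")
        rows.append({
            "workflow_name": name,
            "run_count": count,
            "total_ai_credits": credits,
            "avg_ai_credits": credits / count,
            "total_tokens": tokens,
            "avg_tokens": tokens / count,
            "total_turns": turns,
            "avg_turns": turns / count,
            "total_action_minutes": total(items, "action_minutes"),
            "error_count": total(items, "error_count"),
            "warning_count": total(items, "warning_count"),
        })
    # Highest spend first.
    return sorted(rows, key=lambda row: row["total_ai_credits"], reverse=True)


def write_snapshot(rows, paths):
    """Serialize once and write the same payload to every target path."""
    payload = json.dumps(rows, indent=2)
    for path in paths:
        path = Path(path)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(payload)
```

Because both snapshot files are written inside the same script run, the separate Phase 2 read-and-copy tool call is no longer needed.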
3 · Trim the JSON schema field table in "Data Sources"
Estimated AIC savings: ~0.03–0.05 AIC/run (~200–300 prompt tokens removed)
Evidence: The current prompt contains a 15-row markdown table documenting JSON fields such as status, duration, created_at, and run_id — fields that are self-explanatory or unused in the audit logic. Only aic, token_usage, effective_tokens, turns, action_minutes, conclusion, url, and error_count are referenced in subsequent phases.
Action: Replace the full field table with a condensed note:
Key fields per run: `workflow_name`, `aic` (float, nullable → treat as 0), `token_usage` (int, nullable → 0),
`turns`, `action_minutes`, `conclusion`, `status`, `url`, `error_count`, `warning_count`.
Note: `effective_tokens` is deprecated; use `aic` for billing.
This removes ~7 table rows (~150 tokens) without losing any needed context.
4 · Simplify the PYTHONPATH instruction
Estimated AIC savings: ~0.01–0.02 AIC/run (~80 prompt tokens removed)
Evidence: The Phase 3 instruction states "Set PYTHONPATH=/tmp/gh-aw/token-audit/site-packages${PYTHONPATH:+:$PYTHONPATH} for every Python command that imports pandas, matplotlib, or seaborn" plus a lengthy example. This is spelled out three times across the prompt.
Action: Move the PYTHONPATH export to a single shell snippet at the top of Phase 3 and reference it with a short note:
export PYTHONPATH=/tmp/gh-aw/token-audit/site-packages${PYTHONPATH:+:$PYTHONPATH}
Remove the inline example that repeats the same expansion.
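If the agent launches the chart scripts from Python rather than from a shell, the same expansion can be reproduced with a small helper. This is a minimal sketch; `build_pythonpath` and `child_env` are hypothetical names, not part of the workflow.

```python
import os

SITE_PACKAGES = "/tmp/gh-aw/token-audit/site-packages"


def build_pythonpath(site_packages: str, existing: str = "") -> str:
    """Mirror the shell expansion: prepend site_packages, keep any existing value."""
    # ${PYTHONPATH:+:$PYTHONPATH} adds ":<old value>" only when PYTHONPATH is set and non-empty.
    return site_packages + (":" + existing if existing else "")


def child_env(base_env=None) -> dict:
    """Return a copy of the environment with PYTHONPATH extended for child processes."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYTHONPATH"] = build_pythonpath(SITE_PACKAGES, env.get("PYTHONPATH", ""))
    return env
```

The returned mapping can be passed as `env=` to `subprocess.run`, so the path is set once instead of being repeated on every command.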
5 · Condense the OTEL span block
Estimated AIC savings: ~0.01 AIC/run (~100 prompt tokens removed)
Evidence: The "Experiment OTEL Span Attributes" section embeds 20+ lines of Node.js code inline in the prompt. The code is boilerplate repeated verbatim every run.
Action: Replace the inline code block with a one-line instruction:
If `/tmp/gh-aw/experiments/assignments.json` exists, emit OTEL span attributes using the `otlp.cjs` `logSpan` helper with `gh_aw.experiment.<name>` keys per assignment.
The agent can infer the implementation from the helper documentation.
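To make the key naming concrete, the attribute map could be derived as in the sketch below. It is written in Python for illustration only and does not call the `otlp.cjs` helper. The layout of `assignments.json` is not shown in this report, so the code assumes a flat JSON object mapping experiment name to assigned variant; `experiment_attributes` is a hypothetical name.

```python
import json
from pathlib import Path


def experiment_attributes(path="/tmp/gh-aw/experiments/assignments.json"):
    """Return span attributes keyed gh_aw.experiment.<name>, or {} if the file is absent."""
    file = Path(path)
    if not file.exists():
        return {}
    # ASSUMPTION: the file is a flat JSON object of {experiment_name: variant}.
    assignments = json.loads(file.read_text())
    return {f"gh_aw.experiment.{name}": str(variant) for name, variant in assignments.items()}
```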
🔧 Structural Optimization — Inline Sub-Agent for Phase 1 (Data Processing)
Recommendation: Extract Phases 1 and 2 (process logs + persist snapshot) into an inline ## agent: block using a haiku-class model.
Why this section fits a smaller model
| Dimension | Score | Rationale |
| --- | --- | --- |
| Independence | 3/3 | Runs after data download; no outputs needed from other sections |
| Small-model adequacy | 3/3 | Pure JSON aggregation: no synthesis, no strategy |
| Parallelism | 1/2 | Sequential (must precede Phase 3+4), but isolated from main context |
| Size | 2/2 | Substantial enough to justify agent overhead |
| Total | 9/10 | Strong candidate |
Proposed change
Replace the "Phase 1" and "Phase 2" sections with:
## agent: (haiku) Process Logs and Persist Snapshot
Input: `/tmp/gh-aw/token-audit/workflow-logs.json`
Load the file and extract `.runs`. Filter to `status == "completed"`. Group by `workflow_name` and compute: `run_count`, `total_ai_credits`, `avg_ai_credits`, `total_tokens`, `avg_tokens`, `total_turns`, `avg_turns`, `total_action_minutes`, `error_count`, `warning_count`. Treat null `aic`/`token_usage` as 0.
Save aggregated results (sorted descending by `total_ai_credits`) to:
- `/tmp/gh-aw/token-audit/audit_snapshot.json`
- `/tmp/gh-aw/repo-memory/default/YYYY-MM-DD.json`
Load `/tmp/gh-aw/repo-memory/default/rolling-summary.json`, append today's overall totals (date, total_ai_credits, total_tokens, total_runs, total_action_minutes, active_workflows), trim to 90 entries, and save. **Do not append** when `.runs` is empty or zero completed runs exist.
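The rolling-summary step of this block is the part most likely to go wrong on an empty file, so a sketch may help. It assumes rolling-summary.json holds a JSON array of entries (possibly empty or missing) and that `rows` is the list of per-workflow aggregates described above; `update_rolling_summary` is a hypothetical name.

```python
import json
from datetime import date
from pathlib import Path

MAX_ENTRIES = 90


def update_rolling_summary(path, rows, today=None):
    """Append today's totals to the rolling summary and keep the newest 90 entries."""
    if not rows:
        return False  # no completed runs: do not append, leave the file untouched
    file = Path(path)
    text = file.read_text().strip() if file.exists() else ""
    # ASSUMPTION: the file is a JSON array; a missing or blank file means no history yet.
    history = json.loads(text) if text else []
    history.append({
        "date": (today or date.today()).isoformat(),
        "total_ai_credits": sum(r["total_ai_credits"] for r in rows),
        "total_tokens": sum(r["total_tokens"] for r in rows),
        "total_runs": sum(r["run_count"] for r in rows),
        "total_action_minutes": sum(r["total_action_minutes"] for r in rows),
        "active_workflows": len(rows),
    })
    file.parent.mkdir(parents=True, exist_ok=True)
    # Trim from the front so only the most recent entries survive.
    file.write_text(json.dumps(history[-MAX_ENTRIES:], indent=2))
    return True
```

Once two daily entries have accumulated, the existing "fewer than 2 rolling-summary points" rule stops suppressing the trend chart.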
Why a haiku-class model fits: The task is entirely extractive and formulaic — filter, group, sum, sort, write. No cross-referencing, strategic judgment, or creative output is required. A smaller model handles this pattern reliably and at lower cost.
Estimated AIC savings: ~0.05–0.10 AIC/run (assuming haiku model billing is ~5–10× cheaper than the main model for equivalent token volume in this task).
Full run evidence (10 runs, June 5–12)
Agent duration: min 132 s · max 241 s · avg 185 s · total 1,853 s over 10 runs
⚠️ Caveats
- AIC data unavailable: All pre-aggregated snapshots report 0 AIC and 0 tokens. Savings estimates are based on agent job duration and prompt token analysis, not measured billing data.
- Exclusion window note: This workflow was analyzed on June 9 and June 11. It is selected again because all other AI-powered workflows were also optimized within the past 8 days; this represents the oldest available candidate with measurable run evidence.
- Chart skip cause: The 80% chart skip rate is driven by zero-data logs, not a code defect. Fixes to chart generation will improve issue quality but won't reduce AIC spend.
- Sub-agent savings estimate: The haiku-model savings estimate assumes the sub-agent billing rate is materially lower; verify against actual model pricing before implementing.
Generated by Agentic Workflow AIC Usage Optimizer · ● 13M