[agentic-token-optimizer] Optimization: Daily Agentic Workflow AIC Usage Audit — chart reliability, script consolidation, and Phase 1 sub-agent #129

@github-actions

Target workflow: Daily Agentic Workflow AIC Usage Audit (agentic-token-audit.md)
Selected because: Highest-frequency AI workflow (daily runs); all other AI workflows were optimized within the past 8 days. Fresh evidence from 10 runs reveals a persistent chart-generation gap and opportunities to tighten the prompt structure.


📊 Spend Profile (June 2–12, 2026 · 10 runs analyzed)

| Metric | Value |
| --- | --- |
| Runs analyzed | 10 |
| Conclusions | 10/10 success (100%) |
| Total AIC | N/A (pre-aggregated logs return 0; upstream data gap) |
| Avg agent job duration | 185 s/run (min 132 s, max 241 s) |
| Avg action minutes/run | ~3.6 min |
| Chart uploads (upload_assets) | 2/10 runs (20% success; 80% skipped) |
| Rolling-summary entries | 0 (always empty; trend chart never generated) |
| Cache efficiency | Not measurable from available data |

Caveat: The gh aw logs export has returned zero AIC values across all 7-day snapshots; AIC spend is proxied by agent job duration. Relative comparisons between recommendations are reliable; absolute AIC numbers are not.


🏆 Ranked Recommendations

1 · Fix chart-generation reliability — upload_assets skipped 8/10 runs

Estimated impact: Qualitative (all 8 affected issues lack charts) · ~0 AIC waste but significant issue-quality loss

Evidence: Across all 10 audited runs, upload_assets succeeded in exactly 2 runs (§27136881236 on June 8, §27017599358 on June 5) and was skipped in the other 8: three earlier runs (June 2–4) and five later runs (June 8–12).

Root cause analysis:

  • rolling-summary.json has 0 entries, so the trend chart is always skipped per the existing instruction "If there are fewer than 2 rolling-summary points, skip the trend chart."
  • When all aic values are 0 (as in the current data-sparse window), the agent also skips the bar chart — no explicit instruction covers this case.

Action: Add two clarifying instructions to Phase 3:

  1. "Always generate the bar chart (ai_credits_by_workflow.png) even when all values are 0. Label the x-axis 'No billable AIC in this window' and render zero-height bars. Do not skip the chart."
  2. "If rolling-summary.json has 0 entries, skip the trend chart and write a one-sentence note in the issue explaining this (e.g. 'Trend chart omitted: no historical baseline yet.')."
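
The two instructions, together with the existing "fewer than 2 points" rule, amount to a small decision table. A minimal sketch follows, assuming the agent has the per-workflow AIC totals and the rolling-summary entries in memory; the function name `plan_charts`, its return shape, and the default x-axis label `"Workflow"` are hypothetical and not part of the current prompt.

```python
from typing import Dict, List


def plan_charts(aic_by_workflow: Dict[str, float], rolling_entries: List[dict]) -> dict:
    """Decide which charts to render and which notes to add to the issue."""
    all_zero = all(value == 0 for value in aic_by_workflow.values())
    plan = {
        # Instruction 1: the bar chart is never skipped, even with all-zero data.
        "bar_chart": True,
        "bar_xlabel": "No billable AIC in this window" if all_zero else "Workflow",
        # Existing rule: the trend chart needs at least 2 rolling-summary points.
        "trend_chart": len(rolling_entries) >= 2,
        "notes": [],
    }
    # Instruction 2: explain the missing trend chart when there is no history at all.
    if len(rolling_entries) == 0:
        plan["notes"].append("Trend chart omitted: no historical baseline yet.")
    return plan
```

With the current data (all values 0, empty rolling summary) this yields a bar chart with the "No billable AIC in this window" label, no trend chart, and one explanatory note.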

2 · Merge Phase 1 and Phase 2 into a single Python script

Estimated AIC savings: ~0.05–0.08 AIC/run (1–2 tool calls saved per run)

Evidence: The prompt currently instructs the agent to:

  1. Write a Python script to /tmp/gh-aw/token-audit/process_audit.py (1 tool call: write file)
  2. Run the script (1 tool call: bash)
  3. Read the output file again in Phase 2 to copy it to repo-memory (1 tool call: read + write)

Phases 1 and 2 operate on the same data file. Merging them eliminates one round-trip.

Action: Replace the end of the Phase 1 script spec and all of Phase 2 with a single combined step:

## Phase 1+2 — Process Logs and Persist Snapshot

Write and run a single Python script that:
1. Loads `/tmp/gh-aw/token-audit/workflow-logs.json`, extracts `.runs`, filters to `status == "completed"`.
2. Aggregates per workflow: `run_count`, `total_ai_credits`, `avg_ai_credits`, `total_tokens`, `avg_tokens`, `total_turns`, `avg_turns`, `total_action_minutes`, `error_count`, `warning_count`. Treat null `aic`/`token_usage` as 0.
3. Saves the snapshot to BOTH `/tmp/gh-aw/token-audit/audit_snapshot.json` AND `/tmp/gh-aw/repo-memory/default/YYYY-MM-DD.json` in one write.
4. Loads `/tmp/gh-aw/repo-memory/default/rolling-summary.json`, appends today's overall totals, trims to 90 entries, and saves. Skip the append when `.runs` is empty or no completed runs exist.
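
Steps 1–3 above could look roughly like the following. This is a sketch, not the script the agent currently writes: the function names `aggregate` and `write_snapshot` are hypothetical, the snapshot is assumed to be a JSON object keyed by workflow name, and the input field names are those listed in the prompt.

```python
import json
from collections import defaultdict
from datetime import date
from pathlib import Path

# Output key -> input field on each run record.
SUM_FIELDS = {
    "total_ai_credits": "aic",
    "total_tokens": "token_usage",
    "total_turns": "turns",
    "total_action_minutes": "action_minutes",
    "error_count": "error_count",
    "warning_count": "warning_count",
}


def aggregate(runs):
    """Group completed runs by workflow_name and sum the audited fields."""
    groups = defaultdict(lambda: {"run_count": 0, **{key: 0 for key in SUM_FIELDS}})
    for run in runs:
        if run.get("status") != "completed":
            continue
        row = groups[run.get("workflow_name") or "unknown"]
        row["run_count"] += 1
        for out_key, in_key in SUM_FIELDS.items():
            row[out_key] += run.get(in_key) or 0  # null or missing counts as 0
    for row in groups.values():
        n = row["run_count"]  # always >= 1 for a group that exists
        row["avg_ai_credits"] = row["total_ai_credits"] / n
        row["avg_tokens"] = row["total_tokens"] / n
        row["avg_turns"] = row["total_turns"] / n
    return dict(groups)


def write_snapshot(snapshot, work_dir, memory_dir, today=None):
    """Serialize once, then write the same payload to both locations."""
    today = today or date.today().isoformat()
    payload = json.dumps(snapshot, indent=2, sort_keys=True)
    targets = [Path(work_dir) / "audit_snapshot.json", Path(memory_dir) / f"{today}.json"]
    for target in targets:
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(payload)
    return targets
```

In the workflow, `runs` would come from the `.runs` array of `/tmp/gh-aw/token-audit/workflow-logs.json`, with `work_dir` set to `/tmp/gh-aw/token-audit` and `memory_dir` to `/tmp/gh-aw/repo-memory/default`. Serializing once and writing twice is what removes the separate read-and-copy step.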

3 · Trim the JSON schema field table in "Data Sources"

Estimated AIC savings: ~0.03–0.05 AIC/run (~200–300 prompt tokens removed)

Evidence: The current prompt contains a 15-row markdown table documenting JSON fields such as status, duration, created_at, and run_id — fields that are self-explanatory or unused in the audit logic. Only aic, token_usage, effective_tokens, turns, action_minutes, conclusion, url, and error_count are referenced in subsequent phases.

Action: Replace the full field table with a condensed note:

Key fields per run: `workflow_name`, `aic` (float, nullable → treat as 0), `token_usage` (int, nullable → 0),
`turns`, `action_minutes`, `conclusion`, `status`, `url`, `error_count`, `warning_count`.
Note: `effective_tokens` is deprecated; use `aic` for billing.

This removes ~7 table rows (~150 tokens) without losing any needed context.


4 · Simplify the PYTHONPATH instruction

Estimated AIC savings: ~0.01–0.02 AIC/run (~80 prompt tokens removed)

Evidence: The Phase 3 instruction states "Set PYTHONPATH=/tmp/gh-aw/token-audit/site-packages${PYTHONPATH:+:$PYTHONPATH} for every Python command that imports pandas, matplotlib, or seaborn" plus a lengthy example. This is spelled out three times across the prompt.

Action: Move the PYTHONPATH export to a single shell snippet at the top of Phase 3 and reference it with a short note:

export PYTHONPATH=/tmp/gh-aw/token-audit/site-packages${PYTHONPATH:+:$PYTHONPATH}

Remove the inline example that repeats the same expansion.
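
For Python steps that spawn further Python processes, the same expansion can be mirrored in code. A minimal sketch, assuming a hypothetical helper named `with_site_packages`; the `${PYTHONPATH:+:$PYTHONPATH}` form appends `:` plus the old value only when `PYTHONPATH` is already set and non-empty, which is what the conditional below reproduces.

```python
import os

SITE_PACKAGES = "/tmp/gh-aw/token-audit/site-packages"


def with_site_packages(env=None):
    """Return a copy of env with SITE_PACKAGES prepended to PYTHONPATH.

    Mirrors ${PYTHONPATH:+:$PYTHONPATH}: the ':' separator and the previous
    value are added only when PYTHONPATH is already set and non-empty.
    """
    env = dict(os.environ if env is None else env)
    existing = env.get("PYTHONPATH", "")
    env["PYTHONPATH"] = SITE_PACKAGES + (":" + existing if existing else "")
    return env
```

The result can be passed as `env=` to `subprocess.run` so that chart scripts importing pandas, matplotlib, or seaborn resolve them from the site-packages directory.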


5 · Condense the OTEL span block

Estimated AIC savings: ~0.01 AIC/run (~100 prompt tokens removed)

Evidence: The "Experiment OTEL Span Attributes" section embeds 20+ lines of Node.js code inline in the prompt. The code is boilerplate repeated verbatim every run.

Action: Replace the inline code block with a one-line instruction:

If `/tmp/gh-aw/experiments/assignments.json` exists, emit OTEL span attributes using the `otlp.cjs` `logSpan` helper with `gh_aw.experiment.<name>` keys per assignment.

The agent can infer the implementation from the helper documentation.
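
To make the expected attribute shape concrete, here is a sketch of the key-building step only, written in Python for consistency with the rest of the audit script. It assumes `assignments.json` is a flat JSON object mapping experiment name to assigned variant, which this issue does not confirm, and it does not call or model the `otlp.cjs` `logSpan` helper; `experiment_attributes` is a hypothetical name.

```python
import json
from pathlib import Path

ASSIGNMENTS_PATH = Path("/tmp/gh-aw/experiments/assignments.json")


def experiment_attributes(path=ASSIGNMENTS_PATH):
    """Build span attributes keyed gh_aw.experiment.<name>; empty if the file is absent."""
    path = Path(path)
    if not path.exists():
        return {}
    # Assumed shape: {"<experiment name>": "<assigned variant>", ...}
    assignments = json.loads(path.read_text())
    return {f"gh_aw.experiment.{name}": str(variant) for name, variant in assignments.items()}
```

If the real file nests assignments differently, only the dictionary comprehension would need to change.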


🔧 Structural Optimization — Inline Sub-Agent for Phase 1 (Data Processing)

Recommendation: Extract Phase 1 (process logs + persist snapshot) into an inline ## agent: block using a haiku-class model.

Why this section fits a smaller model

| Dimension | Score | Rationale |
| --- | --- | --- |
| Independence | 3/3 | Runs after data download; no outputs needed from other sections |
| Small-model adequacy | 3/3 | Pure JSON aggregation; no synthesis, no strategy |
| Parallelism | 1/2 | Sequential (must precede Phase 3+4), but isolated from main context |
| Size | 2/2 | Substantial enough to justify agent overhead |
| Total | 9/10 | Strong candidate |

Proposed change

Replace the "Phase 1" and "Phase 2" sections with:

## agent: (haiku) Process Logs and Persist Snapshot

Input: `/tmp/gh-aw/token-audit/workflow-logs.json`

Load the file and extract `.runs`. Filter to `status == "completed"`. Group by `workflow_name` and compute: `run_count`, `total_ai_credits`, `avg_ai_credits`, `total_tokens`, `avg_tokens`, `total_turns`, `avg_turns`, `total_action_minutes`, `error_count`, `warning_count`. Treat null `aic`/`token_usage` as 0.

Save aggregated results (sorted descending by `total_ai_credits`) to:
- `/tmp/gh-aw/token-audit/audit_snapshot.json`
- `/tmp/gh-aw/repo-memory/default/YYYY-MM-DD.json`

Load `/tmp/gh-aw/repo-memory/default/rolling-summary.json`, append today's overall totals (date, total_ai_credits, total_tokens, total_runs, total_action_minutes, active_workflows), trim to 90 entries, and save. **Do not append** when `.runs` is empty or zero completed runs exist.
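
The rolling-summary step in the block above is the part most likely to go wrong on empty windows, so a sketch may help. It assumes `rolling-summary.json` is a JSON array ordered oldest first and that the snapshot is an object keyed by workflow name with the aggregate fields named in the prompt; `update_rolling_summary` is a hypothetical name.

```python
import json
from pathlib import Path

MAX_ENTRIES = 90


def update_rolling_summary(path, snapshot, today):
    """Append today's overall totals and keep only the newest MAX_ENTRIES entries.

    Returns False and leaves the file untouched when there are no completed runs.
    """
    rows = list(snapshot.values())
    total_runs = sum(row["run_count"] for row in rows)
    if total_runs == 0:
        return False  # "Do not append" case: empty .runs or no completed runs

    path = Path(path)
    entries = json.loads(path.read_text()) if path.exists() else []
    entries.append({
        "date": today,
        "total_ai_credits": sum(row["total_ai_credits"] for row in rows),
        "total_tokens": sum(row["total_tokens"] for row in rows),
        "total_runs": total_runs,
        "total_action_minutes": sum(row["total_action_minutes"] for row in rows),
        "active_workflows": len(rows),
    })
    entries = entries[-MAX_ENTRIES:]  # trim from the oldest end
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(entries, indent=2))
    return True
```

Returning early before touching the file keeps zero-run days from adding flat points to the trend chart.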

Why a haiku-class model fits: The task is entirely extractive and formulaic — filter, group, sum, sort, write. No cross-referencing, strategic judgment, or creative output is required. A smaller model handles this pattern reliably and at lower cost.

Estimated AIC savings: ~0.05–0.10 AIC/run (assuming haiku model billing is ~5–10× cheaper than the main model for equivalent token volume in this task).


Full run evidence (10 runs, June 2–12)
| Run ID | Date | Agent duration | upload_assets | Conclusion |
| --- | --- | --- | --- | --- |
| §27419010618 | 2026-06-12 | 220 s | ⏭ skipped | ✅ success |
| §27351174099 | 2026-06-11 | 176 s | ⏭ skipped | ✅ success |
| §27280081885 | 2026-06-10 | 174 s | ⏭ skipped | ✅ success |
| §27209129613 | 2026-06-09 | 241 s | ⏭ skipped | ✅ success |
| §27142277562 | 2026-06-08 | 182 s | ⏭ skipped | ✅ success |
| §27136881236 | 2026-06-08 | 185 s | ✅ success | ✅ success |
| §27017599358 | 2026-06-05 | 202 s | ✅ success | ✅ success |
| §26954882996 | 2026-06-04 | 172 s | ⏭ skipped | ✅ success |
| §26889992660 | 2026-06-03 | 132 s | ⏭ skipped | ✅ success |
| §26823769597 | 2026-06-02 | 169 s | ⏭ skipped | ✅ success |

Agent duration: min 132 s · max 241 s · avg 185 s · total 1,853 s over 10 runs
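
The summary figures can be re-derived from the run table above. The short check below uses only the durations and upload outcomes listed there; the variable names are illustrative.

```python
from statistics import mean

# Agent durations in seconds, newest run first, copied from the run table.
durations = [220, 176, 174, 241, 182, 185, 202, 172, 132, 169]
uploads_succeeded = 2  # runs where upload_assets reported success

total = sum(durations)
average = round(mean(durations))
upload_rate = uploads_succeeded / len(durations)

print(f"min {min(durations)} s, max {max(durations)} s, avg {average} s, total {total} s")
print(f"upload_assets success rate: {upload_rate:.0%}")
```

This prints `min 132 s, max 241 s, avg 185 s, total 1853 s` and a 20% upload success rate, matching the Spend Profile.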


⚠️ Caveats

  • AIC data unavailable: All pre-aggregated snapshots report 0 AIC and 0 tokens. Savings estimates are based on agent job duration and prompt token analysis, not measured billing data.
  • Exclusion window note: This workflow was analyzed on June 9 and June 11. It is selected again because all other AI-powered workflows were also optimized within the past 8 days; this represents the oldest available candidate with measurable run evidence.
  • Chart skip cause: The 80% chart skip rate is driven by zero-data logs, not a code defect. Fixes to chart generation will improve issue quality but won't reduce AIC spend.
  • Sub-agent savings estimate: The haiku-model savings estimate assumes the sub-agent billing rate is materially lower; verify against actual model pricing before implementing.

Generated by Agentic Workflow AIC Usage Optimizer · ● 13M ·

  • expires on Jun 19, 2026, 3:46 PM UTC
