## Summary
The Copilot Token Usage Optimizer consumed 14.9M tokens in its single run on April 18 — the most token-intensive workflow that day (see discussion #27103). This is ironic for a workflow whose job is to reduce token usage in other workflows.
## Current State

| Metric | Value |
| --- | --- |
| Tokens (single run) | 14,886,509 |
| Error rate | 1.00/run (failed to file issue due to port 8080 firewall — fixed in #27080) |
| Actions minutes | 37 min |
| Prompt size | ~10KB (300 lines) |
| GitHub toolset | `[default]` (~22 tools) |
## Root Causes

### 1. Overly broad GitHub toolset

Uses `toolsets: [default]`, which loads ~22 GitHub tools. The workflow only needs to read workflow source files and create issues. Each tool schema adds ~2.5-3K tokens to the system prompt, repeated on every agent turn.
### 2. No `cli-proxy` — uses GitHub MCP for file reads

The workflow reads target workflow `.md` files via GitHub MCP tools (`get_file_contents`). Enabling `cli-proxy` would let the agent use the `gh` CLI directly (e.g., `gh api repos/.../contents/...`), eliminating the need for the full GitHub MCP toolset.
### 3. Verbose prompt with inline bash examples
The prompt is 300 lines / 10KB with multiple inline bash code blocks that the agent copies verbatim. These examples consume input tokens on every turn. The prompt can be condensed to ~167 lines / 7KB (30% reduction) by replacing verbose examples with concise instructions.
### 4. `mount-as-clis: true` adds overhead

This feature mounts MCP servers as CLI wrappers, adding unnecessary complexity. It can be removed.
### 5. `copilot-requests: true` is expensive and unnecessary

This feature flag is not needed for a data analysis workflow. Removing it reduces overhead.
### 6. No pre-aggregation of data

The agent spends many turns running `jq` queries to aggregate workflow data. Adding a bash pre-step to compute top-10 workflows by token usage would eliminate ~5 turns of data gathering.
### 7. Unnecessary `shared/mcp/gh-aw.md` import

This shared component installs and copies the gh-aw binary for MCP server containerization. With `cli-proxy` enabled, the agent can use `gh aw` directly and this import is not needed.
## Recommended Changes

### R1: Narrow GitHub toolset to `[issues]`

The workflow only creates issues via safe-outputs. It does not need repos, PRs, or code search tools.
```yaml
# Before
tools:
  mount-as-clis: true
  github:
    toolsets: [default]

# After
tools:
  github:
    toolsets: [issues]
```
**Estimated savings:** ~50-60K tokens/turn × many turns
### R2: Enable `cli-proxy` and remove `shared/mcp/gh-aw.md`

Use the `gh` CLI for reading workflow source files instead of GitHub MCP.
```yaml
# Add to features
features:
  mcp-cli: true
  cli-proxy: true

# Remove from imports
# - uses: shared/mcp/gh-aw.md  ← remove this
```
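As a sketch of what a `cli-proxy` file read could look like (the repository path is illustrative, and a canned response stands in for the live API call, which needs authentication), the GitHub contents API returns the file body base64-encoded, so the agent decodes it locally:

```shell
# Hypothetical cli-proxy file read; repo and path are illustrative.
# The live call would be something like:
#   gh api repos/OWNER/REPO/contents/workflows/token-optimizer.md
# Decoding a canned contents-API response:
cat > /tmp/contents-response.json <<'EOF'
{"name": "token-optimizer.md", "encoding": "base64", "content": "IyBUb2tlbiBPcHRpbWl6ZXIK"}
EOF
jq -r '.content' /tmp/contents-response.json | base64 -d
```

In practice, `gh api --jq '.content'` can apply the filter in the same invocation, leaving only the `base64 -d` step.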
### R3: Remove `mount-as-clis` and `copilot-requests`

Neither is needed for this analysis workflow.
```yaml
# Remove from tools
# mount-as-clis: true  ← remove

# Remove from features
# copilot-requests: true  ← remove
```
### R4: Condense the prompt (~30% reduction)

- Remove inline bash code blocks — the agent knows how to use `jq`
- Remove redundant Mission section (duplicates Guiding Principles)
- Consolidate Phase 2 subsections into a compact table
- Replace verbose Phase 4 issue template with a bulleted outline
- Consolidate Phase 5 JSON example into a single line
**Target:** ~167 lines / 7.2KB (down from 300 lines / 10.3KB)
### R5: Add bash pre-steps for data aggregation

Add two new bash steps after the log download:

- Pre-aggregate top workflows: run `jq` to compute the top-10 workflows by total tokens, saving the result to `/tmp/gh-aw/token-audit/top-workflows.json`
- Load optimization history: Print previous optimizations so the agent does not need to discover this itself
**Estimated savings:** ~5 fewer agent turns
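The pre-aggregation step might look like the following sketch. The input file name, its shape (one object per run with `workflow` and `tokens` fields), and the sample data are assumptions for illustration; only the `/tmp/gh-aw/token-audit/top-workflows.json` output path comes from the recommendation above.

```shell
# Hypothetical pre-aggregation step: compute the top-10 workflows by
# total token usage from a per-run log file.
mkdir -p /tmp/gh-aw/token-audit

# Sample input in the assumed shape: one JSON object per run.
cat > /tmp/gh-aw/token-audit/runs.json <<'EOF'
[
  {"workflow": "token-optimizer", "tokens": 14886509},
  {"workflow": "daily-audit",     "tokens": 120000},
  {"workflow": "issue-triage",    "tokens": 95000},
  {"workflow": "daily-audit",     "tokens": 80000}
]
EOF

# Group runs by workflow, sum tokens, sort descending, keep the top 10.
jq 'group_by(.workflow)
    | map({workflow: .[0].workflow, total_tokens: map(.tokens) | add})
    | sort_by(-.total_tokens)
    | .[:10]' \
  /tmp/gh-aw/token-audit/runs.json \
  > /tmp/gh-aw/token-audit/top-workflows.json
```

The agent then reads one small, pre-sorted file instead of re-deriving the ranking across several turns.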
## Projected Impact
The combination of toolset narrowing (R1), cli-proxy (R2), prompt condensing (R4), and pre-aggregation (R5) should significantly reduce token usage. The 14.9M figure was inflated by the port 8080 connectivity issue causing 20+ minutes of debugging, but even a healthy run would benefit from these changes.
## References
- Discussion #27103 — Daily token audit report showing 14.9M usage
- PR #27080 — Fix for AWF port 8080 firewall that caused the run failure