Optimize Copilot Token Usage Optimizer: reduce 14.9M tokens/run via toolset, prompt, and cli-proxy changes #27112

@lpcox

Summary

The Copilot Token Usage Optimizer consumed 14.9M tokens in its single run on April 18 — the most token-intensive workflow that day (see discussion #27103). This is ironic for a workflow whose job is to reduce token usage in other workflows.

Current State

| Metric | Value |
| --- | --- |
| Tokens (single run) | 14,886,509 |
| Error rate | 1.00/run (failed to file issue due to port 8080 firewall; fixed in #27080) |
| Actions minutes | 37 min |
| Prompt size | ~10KB (300 lines) |
| GitHub toolset | [default] (~22 tools) |

Root Causes

1. Overly broad GitHub toolset

The workflow uses toolsets: [default], which loads ~22 GitHub tools, but it only needs to read workflow source files and create issues. Each tool schema adds ~2.5-3K tokens to the system prompt, and that cost is repeated on every agent turn.
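As a rough check on the figures above, the per-turn schema overhead can be worked out directly. This is a minimal sketch: schema_overhead is a hypothetical helper name, and the 30-turn count in the last call is an illustrative assumption, not a measured value from the run.

```shell
# Estimate tool-schema tokens carried in the system prompt.
# Args: tool count, tokens per tool schema, number of agent turns (default 1).
schema_overhead() {
  local tools="$1" per_tool="$2" turns="${3:-1}"
  echo $(( tools * per_tool * turns ))
}

schema_overhead 22 2500      # low estimate per turn  -> 55000
schema_overhead 22 3000      # high estimate per turn -> 66000
schema_overhead 22 2500 30   # 30 turns is a hypothetical count -> 1650000
```

The per-turn result (55-66K tokens) is of the same order as the savings estimated under R1; the narrowed toolset still carries some schema cost of its own.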

2. No cli-proxy — uses GitHub MCP for file reads

The workflow reads target workflow .md files via GitHub MCP tools (get_file_contents). Enabling cli-proxy would let the agent use the gh CLI directly (e.g., gh api repos/.../contents/...), eliminating the need for the full GitHub MCP toolset.
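A file read through the gh CLI might look like the sketch below. It assumes the repository contents REST endpoint with the raw media type; read_workflow_source is a hypothetical helper name, and the repository, path, and ref arguments are placeholders rather than values from this workflow.

```shell
# Print the raw contents of a file via the GitHub REST contents endpoint.
# Args: OWNER/REPO, path within the repo, optional git ref (branch, tag, or SHA).
read_workflow_source() {
  local repo="$1" path="$2" ref="${3:-}"
  local endpoint="repos/${repo}/contents/${path}"
  if [ -n "$ref" ]; then
    endpoint="${endpoint}?ref=${ref}"
  fi
  # The raw media type returns the file body instead of base64-encoded JSON.
  gh api -H "Accept: application/vnd.github.raw+json" "$endpoint"
}

# Usage (placeholder arguments):
#   read_workflow_source "OWNER/REPO" "path/to/workflow.md"
```

Requesting the raw media type avoids a separate base64 decode step, so the agent receives the workflow source in a single call.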

3. Verbose prompt with inline bash examples

The prompt is 300 lines / 10KB with multiple inline bash code blocks that the agent copies verbatim. These examples consume input tokens on every turn. The prompt can be condensed to ~167 lines / 7KB (about 30% smaller by size) by replacing verbose examples with concise instructions.

4. mount-as-clis: true adds overhead

This feature mounts MCP servers as CLI wrappers, adding unnecessary complexity. It can be removed.

5. copilot-requests: true is expensive and unnecessary

This feature flag is not needed for a data analysis workflow. Removing it reduces overhead.

6. No pre-aggregation of data

The agent spends many turns running jq queries to aggregate workflow data. Adding a bash pre-step to compute top-10 workflows by token usage would eliminate ~5 turns of data gathering.

7. Unnecessary shared/mcp/gh-aw.md import

This shared component installs and copies the gh-aw binary for MCP server containerization. With cli-proxy enabled, the agent can use gh aw directly and this import is not needed.

Recommended Changes

R1: Narrow GitHub toolset to [issues]

The workflow only creates issues via safe-outputs. Once file reads move to the gh CLI (R2), it does not need repos, PRs, or code search tools.

# Before
tools:
  mount-as-clis: true
  github:
    toolsets: [default]

# After
tools:
  github:
    toolsets: [issues]

Estimated savings: ~50-60K tokens per turn, repeated on every agent turn

R2: Enable cli-proxy and remove shared/mcp/gh-aw.md

Use the gh CLI for reading workflow source files instead of GitHub MCP.

# Add to features
features:
  mcp-cli: true
  cli-proxy: true

# Remove from imports
# - uses: shared/mcp/gh-aw.md  ← remove this

R3: Remove mount-as-clis and copilot-requests

Neither is needed for this analysis workflow.

# Remove from tools
# mount-as-clis: true  ← remove

# Remove from features
# copilot-requests: true  ← remove
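One way to confirm the removals landed is a small grep check against the workflow file. This is only a sketch: check_flags_removed is a hypothetical helper, and it assumes the flags appear as plain "name: true" lines in the workflow's frontmatter.

```shell
# Return 0 when neither flag is still enabled in the given workflow file.
# Lines starting with '#' (comments) are ignored.
check_flags_removed() {
  local file="$1" status=0 flag
  for flag in mount-as-clis copilot-requests; do
    if grep -v '^[[:space:]]*#' "$file" \
        | grep -Eq "^[[:space:]]*${flag}:[[:space:]]*true"; then
      echo "still enabled: ${flag}" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage (path is a placeholder):
#   check_flags_removed path/to/workflow.md && echo "flags removed"
```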

R4: Condense the prompt (~30% reduction)

  • Remove inline bash code blocks — the agent knows how to use jq
  • Remove redundant Mission section (duplicates Guiding Principles)
  • Consolidate Phase 2 subsections into a compact table
  • Replace verbose Phase 4 issue template with a bulleted outline
  • Consolidate Phase 5 JSON example into a single line

Target: ~167 lines / 7.2KB (down from 300 lines / 10.3KB)
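The targets can be checked with simple arithmetic. In the sketch below, pct_reduction and prompt_stats are hypothetical helpers; sizes are passed in tenths of a KB so that integer arithmetic works.

```shell
# Integer percent reduction from "before" to "after" (rounds down).
pct_reduction() {
  local before="$1" after="$2"
  echo $(( (before - after) * 100 / before ))
}

pct_reduction 300 167   # line count: 300 -> 167 lines
pct_reduction 103 72    # size in tenths of a KB: 10.3KB -> 7.2KB

# Report line and byte counts for a prompt file (path supplied by the caller).
prompt_stats() {
  local file="$1"
  printf '%d lines, %d bytes\n' "$(( $(wc -l < "$file") ))" "$(( $(wc -c < "$file") ))"
}
```

By these numbers the ~30% figure matches the reduction in size, while the line count falls by roughly 44%.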

R5: Add bash pre-steps for data aggregation

Add two new bash steps after the log download:

  1. Pre-aggregate top workflows: Use jq to compute the top-10 workflows by total tokens and save the result to /tmp/gh-aw/token-audit/top-workflows.json
  2. Load optimization history: Print previous optimizations so the agent does not need to discover them itself

Estimated savings: ~5 fewer agent turns
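The first pre-step could be sketched roughly as follows. The input schema is an assumption: the example treats the downloaded logs as a JSON array of run records with workflow_name and token_usage fields, which may not match the real log format. aggregate_top_workflows and the input file name are hypothetical; only the output path comes from the list above.

```shell
# Aggregate per-run token usage into a top-10 list by workflow.
# Assumed input: [{"workflow_name": "...", "token_usage": 123}, ...]
# Field names are assumptions; adjust them to the real log schema.
aggregate_top_workflows() {
  local input="$1" output="$2"
  mkdir -p "$(dirname "$output")"
  jq '
    group_by(.workflow_name)
    | map({
        workflow: .[0].workflow_name,
        runs: length,
        total_tokens: (map(.token_usage // 0) | add)
      })
    | sort_by(-.total_tokens)
    | .[:10]
  ' "$input" > "$output"
}

# Usage (input path is a placeholder):
#   aggregate_top_workflows /tmp/gh-aw/token-audit/runs.json \
#     /tmp/gh-aw/token-audit/top-workflows.json
```

Doing the grouping once in a pre-step means the agent starts from a small, sorted summary rather than issuing its own jq queries over the raw logs.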

Projected Impact

The combination of toolset narrowing (R1), cli-proxy (R2), prompt condensing (R4), and pre-aggregation (R5) should significantly reduce token usage. The 14.9M figure was inflated by the port 8080 connectivity issue causing 20+ minutes of debugging, but even a healthy run would benefit from these changes.

References

  • Discussion #27103 — Daily token audit report showing 14.9M usage
  • PR #27080 — Fix for AWF port 8080 firewall that caused the run failure
