Summary
The cli-proxy feature (replacing GitHub MCP server with gh CLI commands routed through a local proxy) shows a ~24% reduction in cost and a ~22% reduction in token usage in the initial Smoke Copilot data. This is a single data point; more runs are needed to confirm the trend.
What cli-proxy changes
With cli-proxy enabled (features: cli-proxy: true):
- Before: Agent → GitHub MCP server (tool schemas injected into context) → GitHub API
- After: Agent → gh bash commands → cli-proxy → DIFC proxy → GitHub API
- LLM calls unchanged: Agent → api-proxy (token tracker) → Copilot API
The key savings mechanism: MCP tool schemas are removed from the context window. GitHub MCP tools inject ~22 tool definitions (~500-700 tokens each, ~10-15K total) into every LLM turn. With cli-proxy, the agent uses gh CLI commands via bash instead, which requires no schema injection.
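The schema overhead quoted above can be checked with simple arithmetic. The following is a minimal sketch that uses only the figures from this report (22 tool definitions, 500-700 tokens each). The per-run total assumes every request in a run carries the full schema set, which is an assumption and not something measured here. The function name is illustrative.

```python
# Back-of-the-envelope estimate of MCP schema overhead.
# Figures come from this report; schema_overhead is an illustrative helper.

def schema_overhead(tool_count: int, tokens_per_tool: tuple[int, int]) -> tuple[int, int]:
    """Return the (low, high) token cost of injecting tool schemas into one turn."""
    low, high = tokens_per_tool
    return tool_count * low, tool_count * high

per_turn = schema_overhead(22, (500, 700))
print(per_turn)  # (11000, 15400)

# Assumption: each of the ~4 requests in a run carries the full schema set.
requests_per_run = 4
per_run = tuple(requests_per_run * t for t in per_turn)
print(per_run)  # (44000, 61600)
```

Under that assumption the overhead is roughly 44K-62K tokens per run, the same order of magnitude as the ~72K token difference observed for Smoke Copilot.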
Observed data
Smoke Copilot (only workflow with before/after data)
| Metric | Pre-cli-proxy (Apr 7) | Post-cli-proxy (Apr 10) | Change |
| --- | --- | --- | --- |
| Tokens/run | ~334K | ~262K | -21.6% |
| Cost/run | $0.68 | $0.52 | -23.5% |
| I/O ratio | 156:1 | 235:1 | Higher (less output per input) |
| Cache hit rate | 42.2% | n/a | n/a |
| Requests/run | ~4 | 5 | Similar |
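The Change column can be reproduced from the two measurements. The following is a minimal sketch using the rounded values from the table; `pct_change` is an illustrative helper and not part of any analyzer workflow.

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100

tokens = pct_change(334_000, 262_000)
cost = pct_change(0.68, 0.52)

print(f"tokens/run: {tokens:.1f}%")  # tokens/run: -21.6%
print(f"cost/run:   {cost:.1f}%")    # cost/run:   -23.5%
```

The headline ~24% figure corresponds to cost; token usage fell by slightly less.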
Source data:
- Pre-cli-proxy: Report #1768 (Apr 7, 4 runs, avg $0.68/run)
- Post-cli-proxy: Manual artifact analysis of run §24222066741 (Apr 10, 5 requests, 262K tokens, $0.52)
cli-proxy enabled: PR #1820, merged Apr 8
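Other cli-proxy workflows (no before/after comparison yet)
- smoke-services and firewall-issue-dispatcher: cli-proxy: true is set in PR #1862 (not yet merged)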
Caveats
- Limited data: Only 1 post-cli-proxy Smoke Copilot run was analyzed (the analyzer workflow had a bug preventing it from finding the data; see below).
- Branch difference: The post-cli-proxy run was on a PR branch (chore/upgrade-ghaw-v0.68.0), not main. The gh-aw version upgrade may contribute to the token difference.
- I/O ratio increased: The higher I/O ratio (235:1 vs 156:1) suggests the agent may be sending more context per request, though producing similar output. This could be a side effect of longer bash command outputs vs structured MCP tool responses.
- No cache data: The post-cli-proxy run did not expose cache read/write breakdown in the raw JSONL (all via copilot provider), making cache comparison impossible.
Token analyzer data gap (resolved)
The daily token usage reports (#1878, #1818) reported "Smoke Copilot produces no token-usage.jsonl". This was incorrect: the data existed in the firewall-audit-logs artifact, but the analyzer workflow was downloading the wrong artifact name (agent-artifacts).
Fixes:
- PR #1883: corrects the artifact name
- PR #1884: refactors all 4 token workflows to use gh aw logs --json (which handles artifact naming internally)
Once merged, future reports will correctly include Smoke Copilot data, enabling proper trend tracking.
Expected savings at scale
If the ~24% reduction holds across workflows:
| Workflow | Current avg cost/run | Projected with cli-proxy | Daily runs | Daily savings |
| --- | --- | --- | --- | --- |
| Smoke Copilot | $0.68 | $0.52 | ~4 | ~$0.64 |
| Build Test Suite | $4.54 | ~$3.45 | ~3 | ~$3.27 |
| Secret Digger | $0.40 (success) | ~$0.30 | ~15 | ~$1.50 |
Note: Build Test Suite and Secret Digger do not currently use cli-proxy. These are projections based on the Smoke Copilot observation.
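The projected figures follow from applying the same relative reduction to each workflow's current cost. The following is a rough sketch, assuming a flat 24% reduction and the approximate daily run counts listed above; `project` is an illustrative helper. The Smoke Copilot row in the table uses the observed $0.52, which the flat 24% reduction also yields after rounding.

```python
REDUCTION = 0.24  # assumed to carry over from the single Smoke Copilot observation

# (workflow, current avg cost per run in USD, approximate runs per day)
workflows = [
    ("Smoke Copilot", 0.68, 4),
    ("Build Test Suite", 4.54, 3),
    ("Secret Digger", 0.40, 15),
]

def project(cost: float, runs: int, reduction: float = REDUCTION) -> tuple[float, float]:
    """Return (projected cost per run, daily savings), both rounded to cents."""
    projected = round(cost * (1 - reduction), 2)
    daily_savings = round((cost - projected) * runs, 2)
    return projected, daily_savings

for name, cost, runs in workflows:
    projected, savings = project(cost, runs)
    print(f"{name}: ${projected:.2f}/run, ~${savings:.2f}/day saved")
```

Summing the three daily savings gives about $5.41 per day, which holds only if the single observed reduction generalizes to workflows that do not yet use cli-proxy.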
Next steps