Telemetry
CKB integrates with OpenTelemetry to answer the question static analysis can't: "Is this code actually used in production?"
By ingesting runtime metrics, CKB can:
- Detect dead code with high confidence
- Show actual call counts for any symbol
- Enrich impact analysis with observed callers
- Distinguish between "no static references" and "truly unused"
Add to .ckb/config.json:
```json
{
  "telemetry": {
    "enabled": true,
    "serviceMap": {
      "my-api-service": "my-repo"
    }
  }
}
```

Point your collector's exporter at CKB:
```yaml
# otel-collector-config.yaml
exporters:
  otlphttp:
    endpoint: "http://localhost:9120/v1/metrics"

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp]
```

Then run:

```bash
ckb telemetry status
```

You should see:
```
Telemetry Status

Enabled: true
Last Sync: 2 minutes ago

Coverage:
  Symbol: 78% (23,456 of 30,123 symbols have telemetry)
  Service: 100% (3 of 3 repos mapped)

Coverage Level: HIGH
```
CKB requires sufficient telemetry coverage to enable certain features:
| Coverage | Symbol % | What's Available |
|---|---|---|
| High | ≥ 70% | Full dead code detection, high-confidence verdicts |
| Medium | 40-69% | Dead code detection with caveats, observed usage |
| Low | 10-39% | Basic observed usage only |
| Insufficient | < 10% | Telemetry features disabled |
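As an illustration, the thresholds in the table above can be expressed as a small Go sketch (the function name and signature are ours, not part of CKB):

```go
package telemetry

// CoverageLevel buckets a symbol-coverage percentage into the levels
// documented above. Illustrative only; CKB's internal logic may differ.
func CoverageLevel(symbolCoveragePct float64) string {
	switch {
	case symbolCoveragePct >= 70:
		return "HIGH" // full dead code detection, high-confidence verdicts
	case symbolCoveragePct >= 40:
		return "MEDIUM" // dead code detection with caveats, observed usage
	case symbolCoveragePct >= 10:
		return "LOW" // basic observed usage only
	default:
		return "INSUFFICIENT" // telemetry features disabled
	}
}
```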
Check your coverage:

```bash
ckb telemetry status
```

With telemetry enabled, you can find code that's never called in production:
```bash
# Find dead code candidates (requires medium+ coverage)
ckb dead-code --min-confidence 0.7

# Scope to a specific module
ckb dead-code --scope internal/legacy

# Include low-confidence results
ckb dead-code --min-confidence 0.5
```

Dead code confidence combines:
- Static analysis — No references found in code
- Telemetry — Zero calls observed over the period
- Match quality — How well telemetry maps to symbols
| Confidence | Meaning |
|---|---|
| 0.9+ | High confidence dead code — safe to remove |
| 0.7-0.9 | Likely dead — verify before removing |
| 0.5-0.7 | Possibly dead — investigate further |
| < 0.5 | Uncertain — may have dynamic callers |
Get observed usage for any symbol:
```bash
ckb telemetry usage --symbol "internal/api/handler.go:HandleRequest"
```

Output:
```
Symbol: HandleRequest
Period: Last 90 days

Observed Usage:
  Total Calls: 1,247,832
  Daily Average: 13,864
  Trend: stable
  Match Quality: exact
  Last Seen: 2 hours ago
```
| Quality | Meaning |
|---|---|
| exact | Symbol name matches telemetry span exactly |
| strong | High-confidence fuzzy match |
| weak | Low-confidence match — verify manually |
CKB needs to know which telemetry service corresponds to which repository.
```json
{
  "telemetry": {
    "serviceMap": {
      "api-gateway": "api-repo",
      "user-service": "users-repo",
      "payment-worker": "payments-repo"
    }
  }
}
```

For services that follow naming conventions, use regex patterns:
```json
{
  "telemetry": {
    "servicePatterns": [
      {
        "pattern": "^order-.*$",
        "repo": "repo-orders"
      },
      {
        "pattern": "^inventory-.*$",
        "repo": "repo-inventory"
      }
    ]
  }
}
```

Service names are resolved to repositories in the following order (see the sketch after this list):

- Explicit `serviceMap` entry
- Pattern match in `servicePatterns` (first match wins)
- `ckb_repo_id` attribute in the telemetry payload
- Service name matches the repo name exactly
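A minimal Go sketch of that precedence, using illustrative types and names rather than CKB's actual implementation:

```go
package telemetry

import "regexp"

// ServicePattern mirrors a servicePatterns entry (assumed shape).
type ServicePattern struct {
	Pattern *regexp.Regexp
	Repo    string
}

// ResolveRepo maps a telemetry service name to a repo ID following the
// documented precedence. knownRepos holds repo IDs for the exact-name
// fallback; ckbRepoID is the ckb_repo_id attribute, if present.
func ResolveRepo(service, ckbRepoID string, serviceMap map[string]string,
	patterns []ServicePattern, knownRepos map[string]bool) (string, bool) {

	// 1. Explicit serviceMap entry
	if repo, ok := serviceMap[service]; ok {
		return repo, true
	}
	// 2. Pattern match in servicePatterns (first match wins)
	for _, p := range patterns {
		if p.Pattern.MatchString(service) {
			return p.Repo, true
		}
	}
	// 3. ckb_repo_id attribute in the telemetry payload
	if ckbRepoID != "" {
		return ckbRepoID, true
	}
	// 4. Service name matches a repo name exactly
	if knownRepos[service] {
		return service, true
	}
	return "", false // unmapped; shows up in `ckb telemetry unmapped`
}
```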
```bash
# See which services aren't mapped
ckb telemetry unmapped

# Test if a service name would map
ckb telemetry test-map "my-service-name"
```

When telemetry is enabled, analyzeImpact includes observed callers:
```
# Via CLI
ckb impact <symbol-id> --include-telemetry

# Via MCP
analyzeImpact({ symbolId: "...", includeTelemetry: true })
```

The response includes:
- `observedCallers` — Services that call this symbol at runtime
- `blendedConfidence` — Combines static and observed confidence
- `observedOnly` — Callers found only via telemetry (not in code)
This catches cases where static analysis misses dynamic dispatch or reflection.
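A rough sketch of the shape of those fields as a Go struct; the type and JSON keys here are assumptions for illustration, not CKB's published response schema:

```go
package telemetry

// TelemetryImpact is an illustrative view of the telemetry-related fields
// that analyzeImpact adds when includeTelemetry is true.
type TelemetryImpact struct {
	ObservedCallers   []string `json:"observedCallers"`   // services calling the symbol at runtime
	BlendedConfidence float64  `json:"blendedConfidence"` // static + observed confidence combined
	ObservedOnly      []string `json:"observedOnly"`      // callers seen only in telemetry, not in code
}
```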
Full configuration options:
```json
{
  "telemetry": {
    "enabled": true,
    "serviceMap": {
      "service-name": "repo-id"
    },
    "servicePatterns": [
      { "pattern": "^order-.*$", "repo": "repo-orders" }
    ],
    "aggregation": {
      "bucketSize": "weekly",
      "retentionDays": 180,
      "minCallsToStore": 1,
      "storeCallers": false,
      "maxCallersPerSymbol": 20
    },
    "deadCode": {
      "enabled": true,
      "minObservationDays": 90,
      "excludePatterns": ["**/test/**", "**/migrations/**"],
      "excludeFunctions": ["*Migration*", "Test*", "*Scheduled*"]
    },
    "privacy": {
      "redactCallerNames": false,
      "logUnmatchedEvents": true
    }
  }
}
```

| Setting | Default | Description |
|---|---|---|
| `enabled` | `false` | Enable telemetry features |
| `serviceMap` | `{}` | Maps service names to repo IDs |
| `servicePatterns` | `[]` | Regex patterns for service mapping |
| `aggregation.bucketSize` | `"weekly"` | Aggregation bucket size (`"daily"`, `"weekly"`, `"monthly"`) |
| `aggregation.retentionDays` | `180` | Days to retain telemetry data |
| `aggregation.minCallsToStore` | `1` | Minimum calls to store (filters noise) |
| `aggregation.storeCallers` | `false` | Store caller service names |
| `aggregation.maxCallersPerSymbol` | `20` | Max callers to store per symbol |
| `deadCode.enabled` | `true` | Enable dead code detection |
| `deadCode.minObservationDays` | `90` | Minimum days of data before reporting |
| `deadCode.excludePatterns` | `[...]` | Path glob patterns to exclude |
| `deadCode.excludeFunctions` | `[...]` | Function name patterns to exclude |
| `privacy.redactCallerNames` | `false` | Redact caller service names in storage |
| `privacy.logUnmatchedEvents` | `true` | Log events that couldn't be matched |
CKB accepts telemetry via OTLP. Configure your OpenTelemetry Collector:
```yaml
# otel-collector-config.yaml
exporters:
  otlphttp/ckb:
    endpoint: "http://localhost:9120"
    tls:
      insecure: true

service:
  pipelines:
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [otlphttp/ckb]
```

Required metric: a `calls` counter with these attributes:
- `code.function` (required) — Function name
- `code.filepath` (recommended) — Source file path
- `code.namespace` (recommended) — Package/namespace
- `code.lineno` (optional) — Line number for exact matching

Resource attributes:

- `service.name` (required) — Maps to a repo via `serviceMap`
- `service.version` (optional) — For trend analysis
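For example, with the OpenTelemetry Go SDK the counter could be emitted as below. This assumes a MeterProvider with an OTLP exporter and the `service.name` resource attribute are configured elsewhere; the metric name `calls` and the attribute keys follow the requirements above, while the function and variable names are illustrative:

```go
package handlers

import (
	"context"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/metric"
)

var (
	meter           = otel.Meter("my-api-service")
	callsCounter, _ = meter.Int64Counter("calls")
)

// recordCall increments the "calls" counter that CKB ingests, tagged with
// the code.* attributes used to match the measurement back to a symbol.
func recordCall(ctx context.Context, function, filepath, namespace string) {
	callsCounter.Add(ctx, 1,
		metric.WithAttributes(
			attribute.String("code.function", function),   // required
			attribute.String("code.filepath", filepath),   // recommended
			attribute.String("code.namespace", namespace), // recommended
		),
	)
}
```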
```bash
# Check status and coverage
ckb telemetry status

# Get usage for a symbol
ckb telemetry usage --symbol "pkg/handler.go:HandleRequest"

# List unmapped services
ckb telemetry unmapped

# Test service name mapping
ckb telemetry test-map "my-service"

# Find dead code
ckb dead-code [--min-confidence 0.7] [--scope module]
```

| Tool | Purpose |
|---|---|
| `getTelemetryStatus` | Coverage metrics and sync status |
| `getObservedUsage` | Runtime usage for a symbol |
| `findDeadCodeCandidates` | Symbols with zero runtime calls |
Enhanced tools:

- `analyzeImpact` — Add `includeTelemetry: true` for observed callers
- `getHotspots` — Includes `observedUsage` when telemetry is enabled
```bash
# Get status
curl http://localhost:8080/telemetry/status

# Get symbol usage
curl "http://localhost:8080/telemetry/usage/SYMBOL_ID?period=30d"

# Find dead code
curl "http://localhost:8080/telemetry/dead-code?minConfidence=0.7"

# List unmapped services
curl http://localhost:8080/telemetry/unmapped

# OTLP ingest endpoint (for collectors)
POST http://localhost:9120/v1/metrics
```

If telemetry features are disabled, add to .ckb/config.json:

```json
{ "telemetry": { "enabled": true } }
```

If coverage is low, your instrumentation may not cover enough symbols. Check:
- Are all services sending telemetry?
- Is `serviceMap` configured correctly?
- Run `ckb telemetry unmapped` to find gaps
If a symbol shows no usage data, possible causes include:
- Symbol isn't called at runtime (it may actually be dead)
- Service mapping is wrong
- Telemetry span names don't match symbol names
Debug with:

```bash
ckb telemetry test-map "your-service-name"
```

If telemetry storage grows too large, reduce retention or switch to monthly aggregation:
```json
{
  "telemetry": {
    "aggregation": {
      "retentionDays": 90,
      "bucketSize": "monthly"
    }
  }
}
```

Best practices:

- Start with an explicit serviceMap — Don't rely on auto-detection
- Check coverage before trusting dead-code results — Medium or higher coverage is required
- Use 90-day periods — Catches infrequent code paths (monthly jobs, etc.)
- Verify before deleting — Even high-confidence dead code should be reviewed
- Monitor unmapped services — New services need to be added to serviceMap
In addition to runtime telemetry, CKB tracks internal metrics for MCP wide-result tools. This helps identify which tools experience heavy truncation and may benefit from Frontier mode.
For each wide-result tool invocation, CKB records:
- Tool name — findReferences, searchSymbols, analyzeImpact, getCallGraph, getHotspots, summarizePr
- Total results — How many results were found
- Returned results — How many were returned after truncation
- Truncation count — How many were dropped
- Response bytes — Actual JSON response size in bytes
- Estimated tokens — Approximate token cost (~4 bytes per token; see the example after this list)
- Execution time — Latency in milliseconds
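The token figure is derived from the byte count using that rule of thumb; for instance, the 15,321-byte average in the export sample below works out to roughly 3,830 tokens. A trivial sketch (the function is ours, not CKB's API):

```go
package metrics

// estimateTokens approximates LLM token cost from response size using the
// ~4 bytes-per-token rule of thumb: 15321 bytes ≈ 3830 tokens.
func estimateTokens(responseBytes int) int {
	return responseBytes / 4
}
```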
Metrics are stored in SQLite (.ckb/ckb.db) and persist across MCP sessions.
```bash
# Last 7 days (default)
ckb metrics

# Last 30 days
ckb metrics --days=30

# Filter to specific tool
ckb metrics --tool=findReferences

# Human-readable format
ckb metrics --format=human

# Export for version comparison
ckb metrics export --version=v7.4 > benchmarks/baseline.json
ckb metrics export --version=v7.5 --output=benchmarks/v7.5.json
```

Use `ckb metrics export` to create versioned snapshots for comparing across releases:
```bash
# Before v7.5 Frontier release
ckb metrics export --version=v7.4 > benchmarks/v7.4-baseline.json

# After Frontier implementation
ckb metrics export --version=v7.5 > benchmarks/v7.5-frontier.json

# Compare
diff benchmarks/v7.4-baseline.json benchmarks/v7.5-frontier.json
```

Export includes:
- `version` — Your custom version tag
- `ckbVersion` — Actual CKB version (e.g., "7.4.0")
- `exportedAt` — ISO 8601 timestamp
- `period` / `since` — Time window for the data
Example export output:
```json
{
  "version": "v7.4",
  "ckbVersion": "7.4.0",
  "exportedAt": "2025-12-23T13:20:30Z",
  "period": "last 30 days",
  "since": "2025-11-23",
  "tools": [
    {
      "name": "searchSymbols",
      "queryCount": 312,
      "totalResults": 15234,
      "totalReturned": 8456,
      "totalTruncated": 6778,
      "truncationRate": 0.445,
      "totalBytes": 4780000,
      "avgBytes": 15321,
      "avgTokens": 3830,
      "avgLatencyMs": 125,
      "needsFrontier": true
    },
    {
      "name": "getCallGraph",
      "queryCount": 189,
      "totalResults": 2341,
      "totalReturned": 2341,
      "totalTruncated": 0,
      "truncationRate": 0,
      "totalBytes": 890000,
      "avgBytes": 4708,
      "avgTokens": 1177,
      "avgLatencyMs": 32,
      "needsFrontier": false
    }
  ]
}
```

The `needsFrontier` flag is true when the truncation rate exceeds 30%.
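A sketch of how that rate and flag follow from the exported totals; the function names are illustrative, not CKB's internals:

```go
package metrics

// truncationRate is the fraction of found results that were dropped:
// for searchSymbols above, 6778 / 15234 ≈ 0.445.
func truncationRate(totalResults, totalTruncated int) float64 {
	if totalResults == 0 {
		return 0
	}
	return float64(totalTruncated) / float64(totalResults)
}

// needsFrontier flags a tool once its truncation rate exceeds 30%.
func needsFrontier(totalResults, totalTruncated int) bool {
	return truncationRate(totalResults, totalTruncated) > 0.30
}
```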
Initial telemetry data from CKB's own usage shows:
| Tool | Truncation Rate | Needs Frontier? |
|---|---|---|
| searchSymbols | 45% | Yes |
| getHotspots | 50% | Yes |
| findReferences | 18% | No |
| getCallGraph | 0% | No |
| analyzeImpact | 0% | No |
Conclusion: Frontier mode is worth implementing for searchSymbols and getHotspots only. The other tools rarely or never truncate with current limits.
Response bytes are measured by JSON-marshaling the response data before sending. This captures the actual payload size consumed by the LLM context window.
Typical response sizes observed:
| Tool | Avg Response | Avg Tokens |
|---|---|---|
| searchSymbols | 15-43 KB | 4,000-11,000 |
| getHotspots | 20-40 KB | 5,000-10,000 |
| findReferences | 8-15 KB | 2,000-4,000 |
| getCallGraph | 5-10 KB | 1,250-2,500 |
| analyzeImpact | 2-5 KB | 500-1,250 |
This data helps measure the actual impact of Frontier mode by comparing bytes before/after pagination.
`getWideResultMetrics` — Returns the same aggregated metrics via MCP. Useful for AI-driven analysis of tool performance.
| Truncation Rate | Recommendation |
|---|---|
| < 10% | Tool is performing well, no action needed |
| 10-30% | Monitor usage patterns |
| > 30% | Consider Frontier mode for this tool |
| > 50% | Frontier mode strongly recommended |
Metrics are retained for 90 days by default. Old records are cleaned up automatically.
```
RecordWideResult() → In-memory aggregator + SQLite persistence
        ↓
ckb metrics CLI ← GetWideResultAggregates()
        ↓
getWideResultMetrics MCP ← Same data via MCP
```
Persistence is non-blocking (async writes) to avoid impacting tool latency.
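A minimal sketch of that non-blocking pattern, assuming a buffered channel drained by a background writer; the types and field names are illustrative, not CKB's actual code:

```go
package metrics

// WideResultRecord captures one wide-result tool invocation.
type WideResultRecord struct {
	Tool          string
	TotalResults  int
	Returned      int
	ResponseBytes int
	LatencyMs     int64
}

// Recorder buffers records and persists them off the request path.
type Recorder struct {
	queue chan WideResultRecord
}

// NewRecorder starts a background writer that applies persist (e.g. the
// in-memory aggregation plus the SQLite insert) to each queued record.
func NewRecorder(persist func(WideResultRecord)) *Recorder {
	r := &Recorder{queue: make(chan WideResultRecord, 1024)}
	go func() {
		for rec := range r.queue {
			persist(rec)
		}
	}()
	return r
}

// RecordWideResult enqueues without blocking; if the buffer is full the
// record is dropped rather than adding latency to the tool call.
func (r *Recorder) RecordWideResult(rec WideResultRecord) {
	select {
	case r.queue <- rec:
	default:
	}
}
```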
See also:

- Configuration — Full config reference including telemetry
- MCP-Tools — MCP tool reference
- API-Reference — HTTP endpoint details
- Prompt-Cookbook — Example prompts for dead code detection