docs: document and stabilize token/cost field semantics #317
Draft
zhongxuanwang-nv wants to merge 1 commit into
Add a canonical "Token and Cost Field Semantics" section to the provider response codecs page: a Usage and CostEstimate field reference, the per-provider token normalization table, an exporter field-mapping table (ATOF/ATIF/OpenInference/OpenTelemetry), and a stability contract.

Add brief field pointers from the OpenTelemetry, OpenInference, and ATIF exporter pages, and a Known Issues entry noting ATIF derives token/cost from the raw event payload rather than the codec annotation.

Lock the contract with two characterization tests: OpenTelemetry LLM end events emit cost only (no token-count or gen_ai attributes), and Usage ignores unmodeled provider subfields. Existing tests already cover the other projections and the USD-only/currency-aware cost behavior.

No runtime behavior change.

Signed-off-by: Zhongxuan Wang <daniewang@nvidia.com>
Walkthrough
The PR adds tests and documentation for LLM token and cost field handling across codec normalization and observability exporters. It also records an ATIF limitation: ATIF derives metrics from raw event payloads instead of the codec annotation.
Changes
Token and cost semantics
Estimated code review effort: 3 (Moderate), ~25 minutes
Pre-merge checks: 5 passed
@coderabbitai review
Action performed: Review finished.
Overview
Document and stabilize NeMo Relay's LLM token and cost field semantics (RELAY-243). This freezes the current behavior as a documented contract and locks it with characterization tests. There is no runtime behavior change.
Details
Adds a canonical Token and Cost Field Semantics section to `docs/integrate-into-frameworks/provider-response-codecs.mdx`:
- `Usage` and `CostEstimate` field reference (names, units, optionality).
- Per-provider token normalization table (… `Usage`).
- Exporter field-mapping table (… `final_metrics` aggregates), "missing ≠ zero", and "Relay does not convert currencies".
- Stability contract (… `0.1` / `ATIF-v1.7` / pricing catalog `version: 1`; additive-only) plus the documented limitations.

Brief field pointers and back-links were added to the OpenTelemetry, OpenInference, and ATIF exporter pages. Cost policy is stated once on the canonical page, per the runtime-contract docs convention (projections do not redefine policy).

A self-contained Known Issues entry documents that ATIF derives token/cost from the raw event payload rather than the codec annotation, so codec-only usage/cost appears in OpenTelemetry/OpenInference but not in ATIF; aligning ATIF is deferred to a follow-up.
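The "missing ≠ zero" rule mentioned above can be pictured with a small sketch. This is not Relay's implementation: `sum_reported` is a hypothetical helper, and the aggregation policy (sum whatever was reported, stay missing when nothing was) is an assumption made only to show why an absent count should not be folded into `0`.

```rust
/// Hypothetical helper, not part of NeMo Relay.
/// Each element is one step's token count: `None` means the provider did not
/// report the field, which is different from an explicit `Some(0)`.
fn sum_reported(steps: &[Option<u64>]) -> Option<u64> {
    steps
        .iter()
        .flatten() // skip steps that reported nothing
        .copied()
        .reduce(|a, b| a + b) // yields `None` when no step reported a value
}
```

Under this assumed policy, an aggregate over steps that never reported a count stays `None`, while a step that explicitly reported zero tokens produces `Some(0)`.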
Two characterization tests lock the freeze:
- OpenTelemetry LLM end events emit cost only: `nemo_relay.llm.cost.*` keys, with no token-count or `gen_ai.*` attributes.
- `Usage` ignores unmodeled provider subfields (forward-compat: no serde catch-all).

Existing tests already cover the remaining projections, per-provider mapping, reasoning-tokens-in-`api_specific`, and the USD-only/currency-aware cost behavior, so no duplicate tests were added.

Testing: targeted `cargo test` (both new tests pass; perturbing the OpenTelemetry exporter to emit a token attribute makes the cost-only test fail as intended, confirming the lock bites), `just docs-linkcheck` (0 errors), and `pre-commit` (SPDX, markdown linkcheck, cargo fmt/clippy/check) all pass.

Follow-up (separate ticket): align ATIF token/cost extraction with the codec-normalized `annotated_response.usage` so codec-only usage reaches ATIF step/final metrics; the extraction is currently sourced from raw output.

Where should the reviewer start?
`docs/integrate-into-frameworks/provider-response-codecs.mdx`: the new Token and Cost Field Semantics section (the exporter field-mapping table and the Stability subsection are the core contract). Then `crates/core/tests/unit/observability/otel_tests.rs::llm_end_emits_cost_only_no_token_or_gen_ai_attributes`.

Related Issues: (use one of the action keywords Closes / Fixes / Resolves / Relates to)
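For readers who want a feel for what the cost-only characterization test guards against, here is a rough sketch of the kind of check involved. It is not the code in `otel_tests.rs`: the `violations` helper and the substring heuristic for "token-count attribute" are assumptions, and the specific key `nemo_relay.llm.cost.total` in the usage note is hypothetical. Only the `nemo_relay.llm.cost.*` and `gen_ai.*` prefixes come from the description above.

```rust
/// Hypothetical helper: returns the attribute keys that would break the
/// cost-only contract for an LLM end event.
/// Assumption: a "token-count attribute" is approximated here as any key
/// containing the substring "token"; the real test may check exact names.
fn violations<'a>(keys: impl IntoIterator<Item = &'a str>) -> Vec<&'a str> {
    keys.into_iter()
        .filter(|key| key.starts_with("gen_ai.") || key.contains("token"))
        .collect()
}
```

With this sketch, a key set such as `["nemo_relay.llm.cost.total"]` yields no violations, while adding `gen_ai.usage.input_tokens` would be flagged, which mirrors the perturbation described under Testing.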