Vault capability advanced metrics #22354

Open
prashantkumar1982 wants to merge 14 commits into develop from feat/vault-capability-advanced-metrics

@prashantkumar1982 prashantkumar1982 commented May 8, 2026

Summary

Adds observability for the Vault capability and its OCR reporting plugin. Each request's lifecycle timestamps and seqNrs are tracked from Capability.handleRequest through blob broadcast, pending-queue writes, observation batch processing, state-transition outcomes, and transmit, ending at either a successful capability response or a timeout.

Also records platform_vault_plugin_local_queue_size on each Observation.

Metrics

  • platform_vault_request_lifecycle_stage_latency_ms — latency from first receipt (capability) to each pipeline stage
  • platform_vault_request_lifecycle_stage_rounds_delta — OCR seqNr delta from blob-chosen round for stages 3–7
  • platform_vault_capability_request_outcome_total (success | timeout | response_error)
  • platform_vault_request_lifecycle_timeout_total — capability-time context timeouts
  • platform_vault_capability_request_response_error_total
  • platform_vault_request_lifecycle_round_delta_skipped_total — when round delta cannot be computed (no blob-chosen baseline)
  • platform_vault_plugin_local_queue_size — items in local request store at Observation time
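
As an illustration of how the latency and round-delta values listed above could be derived, here is a minimal sketch. It is not the PR's code: the function names are hypothetical, and representing a missing blob-chosen baseline as seqNr 0 is an assumption made only for this example.

```go
package main

import (
	"fmt"
	"time"
)

// latencyMs returns the milliseconds elapsed between first receipt of a
// request and the moment a later pipeline stage was reached.
func latencyMs(received, stage time.Time) int64 {
	return stage.Sub(received).Milliseconds()
}

// roundsDelta returns stageSeqNr minus the blob-chosen seqNr. The second
// result is false when no baseline exists (modeled here as seqNr 0, an
// assumption), which is the case a "skipped" counter would record.
func roundsDelta(blobChosenSeqNr, stageSeqNr uint64) (int64, bool) {
	if blobChosenSeqNr == 0 {
		return 0, false
	}
	return int64(stageSeqNr) - int64(blobChosenSeqNr), true
}

func main() {
	received := time.Date(2026, 5, 8, 12, 0, 0, 0, time.UTC)
	transmitAt := received.Add(1500 * time.Millisecond)

	fmt.Println("latency_ms:", latencyMs(received, transmitAt))

	if d, ok := roundsDelta(42, 45); ok {
		fmt.Println("rounds_delta:", d)
	}
	if _, ok := roundsDelta(0, 45); !ok {
		fmt.Println("round delta skipped: no blob-chosen baseline")
	}
}
```

Returning a boolean alongside the delta keeps "no baseline" distinct from a genuine delta of zero, so the caller can increment a skip counter instead of emitting a misleading sample.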

Implementation notes

  • Shared RequestLifecycleTracker wired from delegate.go into capability, reporting plugin factory, and transmitter.
  • The timeout path logs the full lifecycle state, reporting round deltas as -1 when unset.
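
The two notes above can be pictured with a small sketch: one tracker instance is created once and handed to each component, and the timeout path formats whatever was recorded, falling back to -1 for anything unset. The type, method names, stage names, and log format below are hypothetical stand-ins, not the PR's RequestLifecycleTracker.

```go
package main

import (
	"fmt"
	"strings"
	"sync"
)

// lifecycleTracker is a hypothetical shared tracker. A mutex guards the
// maps because several components write to the same instance.
type lifecycleTracker struct {
	mu       sync.Mutex
	baseline map[string]uint64            // requestID -> blob-chosen seqNr
	stages   map[string]map[string]uint64 // requestID -> stage -> seqNr
}

func newLifecycleTracker() *lifecycleTracker {
	return &lifecycleTracker{
		baseline: make(map[string]uint64),
		stages:   make(map[string]map[string]uint64),
	}
}

// SetBaseline records the round in which the request's blob was chosen.
func (t *lifecycleTracker) SetBaseline(id string, seqNr uint64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.baseline[id] = seqNr
}

// RecordStage records the seqNr at which a request reached a stage.
func (t *lifecycleTracker) RecordStage(id, stage string, seqNr uint64) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if t.stages[id] == nil {
		t.stages[id] = make(map[string]uint64)
	}
	t.stages[id][stage] = seqNr
}

// TimeoutLogLine formats the round delta of each requested stage, using -1
// when either the baseline or the stage was never recorded.
func (t *lifecycleTracker) TimeoutLogLine(id string, stages []string) string {
	t.mu.Lock()
	defer t.mu.Unlock()
	base, haveBase := t.baseline[id]
	parts := []string{"request=" + id}
	for _, s := range stages {
		delta := int64(-1)
		if seqNr, ok := t.stages[id][s]; ok && haveBase {
			delta = int64(seqNr) - int64(base)
		}
		parts = append(parts, fmt.Sprintf("%s_rounds_delta=%d", s, delta))
	}
	return strings.Join(parts, " ")
}

// Hypothetical components that all hold the same tracker pointer.
type reportingPlugin struct{ tracker *lifecycleTracker }
type transmitter struct{ tracker *lifecycleTracker }

func main() {
	tracker := newLifecycleTracker() // created once, then shared
	plugin := reportingPlugin{tracker: tracker}
	tx := transmitter{tracker: tracker}

	plugin.tracker.SetBaseline("req-1", 40)
	plugin.tracker.RecordStage("req-1", "state_transition", 42)

	// The request times out before the transmitter records anything.
	fmt.Println(tx.tracker.TimeoutLogLine("req-1", []string{"state_transition", "transmit"}))
	// prints: request=req-1 state_transition_rounds_delta=2 transmit_rounds_delta=-1
}
```

Sharing a pointer rather than copying state means a stage recorded by one component is visible to the timeout path regardless of which component triggers it.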

Tickets:
https://smartcontract-it.atlassian.net/browse/CRE-3517
https://smartcontract-it.atlassian.net/browse/CRE-4072

prashantkumar1982 and others added 4 commits May 7, 2026 12:32
- Gate pending-queue reads in Observation and ValidateObservation when
  VaultForceEmptyOCRRounds is enabled; rename batch to currentPendingQueueItems
  and remove redundant GetPendingQueue call
- Add VaultForceEmptyOCRRounds GateLimiter to reporting plugin config;
  Close() and test helper wiring
- Resolve OCR plugin limits via BoundLimiter.Limit() in initializePluginLimits;
  extract helpers to plugin_utils.go (resolveVaultOCRBoundLimitInt, logging
  for forceEmptyOCRRounds)

Co-authored-by: Cursor <cursoragent@cursor.com>
Track per-request timestamps and seqNrs from capability.handleRequest through
Observation, StateTransition, and Transmit, emitting OTLP latency and round
delta metrics plus outcome counters for success, timeout, and response errors.

Add platform_vault_plugin_local_queue_size histogram on each Observation for
the local request store size.

Co-authored-by: Cursor <cursoragent@cursor.com>
@prashantkumar1982 prashantkumar1982 requested review from a team as code owners May 8, 2026 16:11
github-actions Bot commented May 8, 2026

I see you updated files related to core. Please run make gocs in the root directory to add a changeset, and include at least one of the following tags in its text:

  • #added For any new functionality added.
  • #breaking_change For any functionality that requires manual action for the node to boot.
  • #bugfix For bug fixes.
  • #changed For any change to the existing functionality.
  • #db_update For any feature that introduces updates to database schema.
  • #deprecation_notice For any upcoming deprecation functionality.
  • #internal For changesets that need to be excluded from the final changelog.
  • #nops For any feature that is NOP facing and needs to be in the official Release Notes for the release.
  • #removed For any functionality/config that is removed.
  • #updated For any functionality that is updated.
  • #wip For any change that is not ready yet and external communication about it should be held off till it is feature complete.

@prashantkumar1982 prashantkumar1982 marked this pull request as draft May 8, 2026 16:13
github-actions Bot commented May 8, 2026

✅ No conflicts with other open PRs targeting develop

@prashantkumar1982 prashantkumar1982 added the build-publish (Build and Publish image to SDLC) label May 8, 2026
@prashantkumar1982 prashantkumar1982 marked this pull request as ready for review May 8, 2026 19:37
@prashantkumar1982 prashantkumar1982 requested a review from a team as a code owner May 8, 2026 23:34
github-actions Bot commented May 8, 2026

CORA - Pending Reviewers

| Codeowners Entry | Overall | Num Files | Owners |
| --- | --- | --- | --- |
| /core/capabilities/ | 💬 | 4 | @smartcontractkit/keystone, @smartcontractkit/capabilities-team |
| /core/services/ocr*/ | 💬 | 8 | @smartcontractkit/foundations, @smartcontractkit/core |
| go.mod | 💬 | 6 | @smartcontractkit/core, @smartcontractkit/foundations |
| go.sum | 💬 | 6 | @smartcontractkit/core, @smartcontractkit/foundations |
| integration-tests/go.mod | 💬 | 1 | @smartcontractkit/core, @smartcontractkit/devex-tooling, @smartcontractkit/foundations |
| integration-tests/go.sum | 💬 | 1 | @smartcontractkit/core, @smartcontractkit/devex-tooling, @smartcontractkit/foundations |

Legend: ✅ Approved | ❌ Changes Requested | 💬 Commented | 🚫 Dismissed | ⏳ Pending | ❓ Unknown

For more details, see the full review summary.

@prashantkumar1982 prashantkumar1982 changed the title from "Vault capability lifecycle metrics and local queue size" to "Vault capability advanced metrics" May 9, 2026
Comment on lines +40 to +46
limiter, err := limits.MakeUpperBoundLimiter(factory, spec)
if err != nil {
	return 0, fmt.Errorf("%s: %w", settingKey, err)
}
defer func() { _ = limiter.Close() }()

v, err := limiter.Limit(ctx)
No need to construct a limiter to just do a lookup:

Suggested change
- limiter, err := limits.MakeUpperBoundLimiter(factory, spec)
- if err != nil {
- 	return 0, fmt.Errorf("%s: %w", settingKey, err)
- }
- defer func() { _ = limiter.Close() }()
- v, err := limiter.Limit(ctx)
+ v, err := spec.GetOrDefault(ctx, factory.Settings)
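
The trade-off behind this suggestion can be seen in a self-contained sketch. Everything in it is a hypothetical stand-in (settingSpec, boundLimiter, the maxBatchSize key) and it does not reproduce the real limits or settings APIs. It only contrasts reading one value through an object that has a lifecycle with reading it directly.

```go
package main

import "fmt"

// settingSpec is a hypothetical setting: a key plus a compiled-in default.
type settingSpec struct {
	Key     string
	Default int
}

// getOrDefault returns the override for the key if one exists, else the
// default. Reading a nil map is safe in Go, so nil means "no overrides".
func (s settingSpec) getOrDefault(overrides map[string]int) int {
	if v, ok := overrides[s.Key]; ok {
		return v
	}
	return s.Default
}

// boundLimiter is a hypothetical object wrapping the same lookup, but with
// a lifecycle (construct, use, close) that the caller has to manage.
type boundLimiter struct{ limit int }

func newBoundLimiter(s settingSpec, overrides map[string]int) *boundLimiter {
	return &boundLimiter{limit: s.getOrDefault(overrides)}
}

func (l *boundLimiter) Limit() int   { return l.limit }
func (l *boundLimiter) Close() error { return nil }

func main() {
	spec := settingSpec{Key: "maxBatchSize", Default: 10}
	overrides := map[string]int{"maxBatchSize": 25}

	// Construct, read once, close: three steps to obtain one value.
	l := newBoundLimiter(spec, overrides)
	viaLimiter := l.Limit()
	_ = l.Close()

	// Direct lookup: one step, nothing to close.
	direct := spec.getOrDefault(overrides)

	fmt.Println(viaLimiter == direct, direct) // prints: true 25
	fmt.Println(spec.getOrDefault(nil))       // prints: 10
}
```

When the value is read once and never enforced, the direct lookup yields the same result with no construction error path and no deferred Close.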
