Vault: VaultForceEmptyOCRRounds and plugin limit resolution #22345
prashantkumar1982 wants to merge 5 commits into
- Gate pending-queue reads in Observation and ValidateObservation when VaultForceEmptyOCRRounds is enabled; rename batch to currentPendingQueueItems and remove redundant GetPendingQueue call
- Add VaultForceEmptyOCRRounds GateLimiter to reporting plugin config; Close() and test helper wiring
- Resolve OCR plugin limits via BoundLimiter.Limit() in initializePluginLimits; extract helpers to plugin_utils.go (resolveVaultOCRBoundLimitInt, logging for forceEmptyOCRRounds)

Co-authored-by: Cursor <cursoragent@cursor.com>
… into feat/vault-force-empty-ocr-rounds
func forceEmptyOCRRounds(ctx context.Context, lggr logger.Logger, vaultForceEmptyOCRRounds limits.GateLimiter) bool {
	err := vaultForceEmptyOCRRounds.AllowErr(ctx)
	if err == nil {
		lggr.Warnw("VaultForceEmptyOCRRounds is enabled; pending queue is not read this OCR round — store-backed pending observation items are skipped")
Any chance we could add this logging closer to where we apply the side effect? Otherwise I'm afraid we could make code changes that render this logging ineffective, which would be confusing.
if err != nil {
	return nil, fmt.Errorf("could not fetch batch of requests: %w", err)
var currentPendingQueueItems []*vaultcommon.StoredPendingQueueItem
if !forceEmptyOCRRounds(ctx, r.lggr, r.cfg.VaultForceEmptyOCRRounds) {
Can we refactor this so that we deal with this as an early return instead? i.e. check this condition once, and if true immediately return an empty observation
By the way -- did you intend to just set the current round's pending queue to empty? or did you also intend to make the local queue empty in the returned observation?
> check this condition once, and if true immediately return an empty observation

We still want to update the pending queue with newer items, right? Hence we are not returning an empty observation.

> By the way -- did you intend to just set the current round's pending queue to empty? or did you also intend to make the local queue empty in the returned observation?

Just set the current round's pending queue to empty, and start preparing the new pending queue from the local queue.
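The behaviour described in this reply can be sketched roughly as follows. This is a minimal, self-contained illustration and not the PR's code: `PendingItem`, `Observation`, and `buildObservation` are hypothetical names, the gate is modelled as a plain `bool` rather than a `limits.GateLimiter`, and `fmt.Println` stands in for the plugin logger. It only shows the intended shape: when the gate is on, the store-backed pending queue is not read for this round, while the local queue still feeds the next pending queue.

```go
package main

import "fmt"

// PendingItem is a hypothetical stand-in for vaultcommon.StoredPendingQueueItem.
type PendingItem struct{ ID string }

// Observation is a hypothetical, simplified observation payload.
type Observation struct {
	CurrentPending []PendingItem // items read from the store-backed pending queue
	NextPending    []PendingItem // candidates taken from the local queue
}

// buildObservation skips the store-backed pending queue when forceEmpty is
// true, but still proposes new pending items from the local queue.
func buildObservation(forceEmpty bool, readPending func() ([]PendingItem, error), localQueue []PendingItem) (Observation, error) {
	var current []PendingItem
	if forceEmpty {
		// Logging sits next to the side effect (the skipped read).
		fmt.Println("VaultForceEmptyOCRRounds enabled: skipping pending queue read")
	} else {
		items, err := readPending()
		if err != nil {
			return Observation{}, fmt.Errorf("could not read pending queue: %w", err)
		}
		current = items
	}
	return Observation{CurrentPending: current, NextPending: localQueue}, nil
}

func main() {
	stored := func() ([]PendingItem, error) { return []PendingItem{{ID: "stored-1"}}, nil }
	local := []PendingItem{{ID: "local-1"}}

	obs, err := buildObservation(true, stored, local)
	fmt.Println(len(obs.CurrentPending), len(obs.NextPending), err)
}
```

Keeping the log call inside the branch that skips the read is one way to address the earlier review concern about the warning drifting away from the side effect it describes.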
… into feat/vault-force-empty-ocr-rounds
Summary
This change wires `VaultForceEmptyOCRRounds` (chainlink-common CRE setting) into the vault OCR2 plugin, refines how reporting plugin limits are resolved, and moves small helpers into `plugin_utils.go`. It also adds request lifecycle instrumentation for Vault capability + OCR (OTLP via beholder), with a non-optional `RequestLifecycleTracker` for the reporting plugin factory and transmitter.

https://smartcontract-it.atlassian.net/browse/CRE-4071
https://smartcontract-it.atlassian.net/browse/CRE-3478
Request lifecycle stages
These are the stage string values used in traces, logs (`furthest_stage` on timeout), and metric attributes where applicable. Order follows the happy path from capability ingress through OCR.

- `received`: `handleRequest` (start of trace; used for `furthest_stage` when nothing else ran).
- `blob_broadcasting`
- `blob_broadcasted`
- `written_to_pending_queue`
- `observed_outcome`
- `state_transition_outcome`
- `transmitted`
- `capability_response_received`

Latency histograms are emitted from `received` as time zero for stages starting at `blob_broadcasting` through `capability_response_received` (only for stages that actually occurred). The rounds histogram measures OCR seqNr delta from `blob_broadcasting` for subsequent stages (`blob_broadcasted` through `transmitted`).

Metrics (OTLP)
Unless noted, lifecycle/capability metrics include attribute `config_digest` (OCR config digest string).

Capability + request lifecycle

- `platform_vault_capability_requests_received_total`: (… `handleRequest`).
- `platform_vault_request_lifecycle_stage_latency_ms`: `stage`: `blob_broadcasting`, `blob_broadcasted`, `written_to_pending_queue`, `observed_outcome`, `state_transition_outcome`, `transmitted`, `capability_response_received`. Wall time from `received` to that stage.
- `platform_vault_request_lifecycle_stage_rounds_delta`: `stage`: `blob_broadcasted`, `written_to_pending_queue`, `observed_outcome`, `state_transition_outcome`, `transmitted`. SeqNr delta from `blob_broadcasting`.
- `platform_vault_request_lifecycle_round_delta_skipped_total`: `stage`: same as rounds histogram when `blob_broadcasting` never ran (cannot compute delta).
- `platform_vault_request_lifecycle_timeout_total`: (… `handleRequest` deadline).
- `platform_vault_request_lifecycle_pending_queue_not_in_local_queue_total`
- `platform_vault_request_lifecycle_transmit_not_in_local_queue_total`
- `platform_vault_capability_request_outcome_total`: `outcome`: `success`, `timeout`, `response_error`. For `timeout`, also `furthest_stage`: one of the lifecycle stage values above (furthest reached when the trace ended).
- `platform_vault_capability_request_response_error_total`

Vault OCR reporting plugin (separate meter instruments)
These use attribute `configDigest` (string) on recordings, plus `method` for KV duration as listed in code.

- `platform_vault_plugin_queue_overflow`: `queueSize`, `batchSize` when local observation queue is truncated.
- `platform_vault_plugin_kv_operation_duration_ms`: `method` per wrapped KV call.
- `platform_vault_plugin_local_queue_size`: (… `{request}`)

Disk usage (if enabled in plugin)
platform_vault_disk_usage_bytes