PLEX-2935/generalize monitoring #2132

Merged: ilija42 merged 7 commits into feature/solcap-read_trigger from PLEX-2935/generalize-monitoring on Jun 18, 2026

@Unheilbar Unheilbar commented Jun 8, 2026

JIRA
MonitoringContext() implementation by solana capability here

Summary

Adds --with-monitoring to v2 capability server generation. Action lifecycle monitoring (initiated / success / error) lives in the generated server wrapper, not in action implementations.

What changed

  • --with-monitoring — generated server wraps RPC dispatch with lifecycle logs + metrics
  • pkg/capabilities/v2/monitoring/ — shared monitoring package:
    • MonitoringContext() on ClientCapability (logger + chain/capability labels)
    • MonitoringLabels on request types (LogKVs() / MetricKVs())
    • Unified OTel metrics: capabilities_v2_action_count, capabilities_v2_action_duration (method, outcome labels)
  • Solana — first onboarded capability
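The LogKVs() / MetricKVs() split on request types can be sketched as below. This is a minimal illustration, not the code from this PR: the `MonitoringLabels` interface shape, the return types, and the fields of this stand-in `WriteReportRequest` are all assumptions. Only the method names come from the description above.

```go
package main

import (
	"fmt"
	"strconv"
)

// MonitoringLabels is a hypothetical shape for the interface named in the PR.
// The real signatures may differ.
type MonitoringLabels interface {
	LogKVs() []any                // detailed key/value pairs, for logs only
	MetricKVs() map[string]string // bounded label set, safe for metrics
}

// WriteReportRequest is an illustrative stand-in, not the generated proto type.
type WriteReportRequest struct {
	Receiver  string // assumed field: effectively unbounded set of values
	ReportLen int    // assumed field
}

// LogKVs may carry high-cardinality detail because logs are not aggregated by label.
func (r *WriteReportRequest) LogKVs() []any {
	return []any{"receiver", r.Receiver, "reportLen", r.ReportLen}
}

// MetricKVs exposes only values drawn from a small fixed set.
func (r *WriteReportRequest) MetricKVs() map[string]string {
	return map[string]string{"empty_report": strconv.FormatBool(r.ReportLen == 0)}
}

func main() {
	var req MonitoringLabels = &WriteReportRequest{Receiver: "receiver-1", ReportLen: 128}
	fmt.Println("log fields:", req.LogKVs())
	fmt.Println("metric labels:", req.MetricKVs())
}
```

Keeping the receiver out of MetricKVs() is the point of the split: every distinct label value creates a new time series, while a log line can carry it at no such cost.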

Why

  • New RPCs get monitoring as soon as the server is regenerated with --with-monitoring
  • Removes Initiated/Success/Error boilerplate from action code
  • One counter + one histogram instead of per-method metric names
  • Log enrichment (LogKVs) and metric enrichment (MetricKVs) are split so metric labels stay low-cardinality
  • Default generation unchanged; capabilities migrate one at a time

Notes

  • Sub-execution events (e.g. WriteReport fee errors) and triggers still use legacy ProtoProcessor

Dashboard query comparison (EVM vs Solana v2)

EVM — one metric name per action × outcome, stitched with label_replace

Each RPC/outcome pair is a separate metric. Dashboards must list every series and derive an action label from __name__:

sum by (chain_id, action, env, network_name, network_name_full) (
  label_replace(rate(evm_capability_call_contract_error_count{platformEnv="staging-testnet",zone=~"zone-b|",donID=~"^cre.*",host_name!~"cl-cre-gateway-one-zone-b-.*", chain_id=~".*"}[5m]), "action", "call_contract_error", "__name__", ".*")
  or label_replace(rate(evm_capability_filter_logs_error_count{platformEnv="staging-testnet",zone=~"zone-b|",donID=~"^cre.*",host_name!~"cl-cre-gateway-one-zone-b-.*", chain_id=~".*"}[5m]), "action", "filter_logs_error", "__name__", ".*")
  or label_replace(rate(evm_capability_balance_at_error_count{platformEnv="staging-testnet",zone=~"zone-b|",donID=~"^cre.*",host_name!~"cl-cre-gateway-one-zone-b-.*", chain_id=~".*"}[5m]), "action", "balance_at_error", "__name__", ".*")
  or label_replace(rate(evm_capability_estimate_gas_error_count{platformEnv="staging-testnet",zone=~"zone-b|",donID=~"^cre.*",host_name!~"cl-cre-gateway-one-zone-b-.*", chain_id=~".*"}[5m]), "action", "estimate_gas_error", "__name__", ".*")
  or label_replace(rate(evm_capability_get_transaction_by_hash_error_count{platformEnv="staging-testnet",zone=~"zone-b|",donID=~"^cre.*",host_name!~"cl-cre-gateway-one-zone-b-.*", chain_id=~".*"}[5m]), "action", "get_transaction_by_hash_error", "__name__", ".*")
  or label_replace(rate(evm_capability_get_transaction_receipt_error_count{platformEnv="staging-testnet",zone=~"zone-b|",donID=~"^cre.*",host_name!~"cl-cre-gateway-one-zone-b-.*", chain_id=~".*"}[5m]), "action", "get_transaction_receipt_error", "__name__", ".*")
  or label_replace(rate(evm_capability_header_by_number_error_count{platformEnv="staging-testnet",zone=~"zone-b|",donID=~"^cre.*",host_name!~"cl-cre-gateway-one-zone-b-.*", chain_id=~".*"}[5m]), "action", "header_by_number_error", "__name__", ".*")
)

Adding a new RPC requires new metric names and a new label_replace branch in every panel.

Solana (v2 unified) — same panel shape, one metric + labels

Equivalent error-rate panel: method replaces derived action; outcome="error" replaces _error_count suffixes:

sum by (chain_id, method, network_name, network_name_full) (
  rate(capabilities_v2_action_count{
    outcome="error",
    chain_family_name="solana",
    platformEnv="staging-testnet",
    zone=~"zone-b|",
    donID=~"^cre.*",
    chain_id=~".*"
  }[5m])
)

New RPCs appear automatically as new method label values — no dashboard query changes.

Success rate by method (same unified model):

100 * sum by (chain_id, method, network_name, network_name_full) (
  rate(capabilities_v2_action_count{outcome="success", chain_family_name="solana", platformEnv="staging-testnet", zone=~"zone-b|", donID=~"^cre.*", chain_id=~".*"}[5m])
)
/ clamp_min(
  sum by (chain_id, method, network_name, network_name_full) (
    rate(capabilities_v2_action_count{chain_family_name="solana", platformEnv="staging-testnet", zone=~"zone-b|", donID=~"^cre.*", chain_id=~".*"}[5m])
  ),
  1e-9
)
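The arithmetic of the success-rate query can be checked with a small Go sketch. It assumes the clamp_min(..., 1e-9) guard is there to avoid a 0/0 result when a method has no traffic in the window; the function names are made up for this example.

```go
package main

import (
	"fmt"
	"math"
)

// clampMin mirrors PromQL's clamp_min for a single sample: values below floor are raised to floor.
func clampMin(v, floor float64) float64 { return math.Max(v, floor) }

// successRatePct mirrors 100 * success / clamp_min(total, 1e-9) for one label set.
func successRatePct(successPerSec, totalPerSec float64) float64 {
	return 100 * successPerSec / clampMin(totalPerSec, 1e-9)
}

func main() {
	fmt.Println(successRatePct(3, 4)) // 75
	fmt.Println(successRatePct(0, 0)) // 0, thanks to the clamp

	// Without the clamp, an idle method would yield NaN.
	zero := 0.0
	fmt.Println(math.IsNaN(100 * zero / zero)) // true
}
```

With the guard, an idle method reports 0% rather than NaN; whether 0% or a gap is the better rendering for idle methods is a dashboard choice.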

github-actions Bot commented Jun 8, 2026

⚠️ API Diff Results - github.com/smartcontractkit/chainlink-common

⚠️ Breaking Changes (2)

pkg/capabilities/v2/chain-capabilities/solana/server.ClientCapability (1)
  • MonitoringContext — ➕ Added
pkg/capabilities/v2/protoc/pkg (1)
  • GenerateServer — Type changed:
func(
  *google.golang.org/protobuf/compiler/protogen.Plugin, 
  *google.golang.org/protobuf/compiler/protogen.File, 
  ServerLanguage, 
  string, 
  string, 
  + bool
)
error

✅ Compatible Changes (4)

package github (1)
  • com/smartcontractkit/chainlink-common/pkg/capabilities/v2/monitoring — ➕ Added
pkg/capabilities/v2/chain-capabilities/solana.(*WriteReportRequest) (2)
  • LogKVs — ➕ Added

  • MetricKVs — ➕ Added

pkg/capabilities/v2/chain-capabilities/solana/server.ClientCapability (1)
  • MonitoringContext — ➕ Added

📄 View full apidiff report

@Unheilbar Unheilbar changed the title Plex 2935/generalize monitoring PLEX-2935/generalize monitoring Jun 8, 2026
Comment thread pkg/capabilities/v2/protoc/main.go Outdated
Comment thread pkg/capabilities/v2/chain-capabilities/solana/server/client_server_gen.go Outdated
Comment thread pkg/capabilities/v2/monitoring/context.go Outdated
Comment thread pkg/capabilities/v2/monitoring/execution_context.proto Outdated
Comment thread pkg/capabilities/v2/protoc/pkg/templates/server_with_monitoring.go.tmpl Outdated
@Unheilbar Unheilbar marked this pull request as ready for review June 18, 2026 17:13
@Unheilbar Unheilbar requested review from a team as code owners June 18, 2026 17:13

@ilija42 ilija42 left a comment

nits, really nice job

Comment thread pkg/capabilities/v2/protoc/pkg/templates/server.go.tmpl Outdated
c.actionMetrics.OnError(ctx, "GetAccountInfoWithOpts", tsStart, time.Now(), isUserError, capmon.ActionMetricAttributes("GetAccountInfoWithOpts", metadata, mc.MetricsAttributes, metricAttrs)...)
return nil, capabilities.ResponseMetadata{}, nil, err
}
mc.Logger.Infow("capability succeeded", logKvs...)

Are you removing these logs from solana cap actions in a followup?

Unheilbar (PR author) replied:

yep

@ilija42 ilija42 merged commit 7471895 into feature/solcap-read_trigger Jun 18, 2026
15 of 18 checks passed
@ilija42 ilija42 deleted the PLEX-2935/generalize-monitoring branch June 18, 2026 18:25

4 participants