feat(dips): dedicated fast loop with offer-existence gate #1200
Open
MoonBoi9001 wants to merge 15 commits into main-dips from
c4d0d52 to
861cddf
Compare
MoonBoi9001 added a commit to edgeandnode/local-network that referenced this pull request on Apr 25, 2026
The indexer-agent's DIPs accept gate (graphprotocol/indexer#1200) queries the indexing-payments-subgraph for OfferStored entities before calling acceptIndexingAgreement. Without a configured endpoint the gate is a no-op and a dropped offer() tx loses the agreement to a permanent deterministic rejection. Set INDEXER_AGENT_INDEXING_PAYMENTS_SUBGRAPH_ENDPOINT alongside the existing INDEXER_AGENT_OFFCHAIN_SUBGRAPHS so the agent picks up the gate as soon as the subgraph deployment is detected. Also add inline shellcheck directives to silence pre-existing SC1091 / SC2153 notes the post-edit-lint hook surfaced.
78f4c6c to
790355a
Compare
2d3a942 to
33c5e22
Compare
Decouples DIPs proposal acceptance from the 120s reconciliation cycle into a dedicated 5s polling loop (startProposalAcceptanceLoop). The 300s RCA deadline left insufficient slack with the old 240s+ latency.

Bundles the supporting fixes the new loop needs to land usefully:
- Unpack the agreement and signature as separate arguments to acceptIndexingAgreement. The on-chain contract split the previously-packed SignedRCA arg; without unpacking, the call reverts with FailedCall().
- Create the local indexing rule for the deployment inside processProposal, before the accept tx fires. With the fast loop in place, accept now races reconciliation; the rule must exist first or reconciliation sees the new allocation as an orphan and tries to unallocate it (IE067).
- Ensure the subgraph deployment exists on graph-node before the multicall. The contract's state-validation step looks the deployment up; if it isn't local the multicall reverts.
- Log full revert context (reason, data, message, contract target) on accept failure. Previously this showed `error: null` for any custom error the parser didn't recognise.
Before: processProposal read a pending RCA and immediately called acceptIndexingAgreement. If dipper's offer() tx was evicted from the mempool (or hadn't confirmed yet), the contract reverted with RecurringCollectorInvalidSigner. handleAcceptError treated the CALL_EXCEPTION as a permanent deterministic failure and rejected the proposal, losing the slot to reassessment.

After: before calling acceptIndexingAgreement, query the indexing-payments-subgraph for Offer(id: agreementId). Missing offer means dipper's submission hasn't reached the subgraph yet; stay pending and let the 5s acceptance loop re-pop the row on the next tick. If the RCA deadline is within 30s, give up and mark offer_never_landed so reassessment can pick a replacement.

Wired via a new OfferMonitor helper and an optional indexingPaymentsSubgraph field on the network specification. When the field isn't configured (operator didn't wire it up) the gate is bypassed and the prior "try and see" behaviour is preserved.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
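The three outcomes of the gate described in this commit can be sketched as a pure decision function. This is only an illustration under stated assumptions: `decideOfferGate`, `GateDecision`, and the parameter names are hypothetical, and treating "within 30s" as an inclusive boundary is a guess; only the 30-second margin, the `offer_never_landed` outcome, and the bypass when no subgraph is configured come from the commit message.

```typescript
// Hypothetical sketch of the offer-existence gate; not the PR's actual code.
type GateDecision = 'proceed' | 'wait' | 'reject_offer_never_landed'

// Constant name taken from a later commit in this PR; the value is the 30s margin.
const OFFER_GATE_DEADLINE_SAFETY_MARGIN_SECONDS = 30

function decideOfferGate(
  // true/false: result of the subgraph lookup; null: no subgraph configured (gate bypassed)
  offerExists: boolean | null,
  deadlineSeconds: number,
  nowSeconds: number,
  safetyMarginSeconds: number = OFFER_GATE_DEADLINE_SAFETY_MARGIN_SECONDS,
): GateDecision {
  // Gate bypassed or offer already indexed: go ahead and attempt the accept.
  if (offerExists === null || offerExists) return 'proceed'

  // Offer missing: give up only when the deadline is too close to be worth waiting for.
  // Assumption: the boundary is inclusive (exactly 30s remaining counts as "within 30s").
  const remainingSeconds = deadlineSeconds - nowSeconds
  return remainingSeconds <= safetyMarginSeconds ? 'reject_offer_never_landed' : 'wait'
}
```

Keeping the decision free of I/O makes each branch easy to cover in a unit test; the subgraph query and the database update would live in the caller.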
Wrap collectAgreementPayments and matchAgreementAllocations bodies so a transient subgraph or RPC failure logs and skips this tick instead of throwing through mapNetworkMapped and aborting the entire reconcile cycle for every configured network. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
33c5e22 to
76346c1
Compare
Adds a fourth case to the offer-existence-gate suite asserting that when indexingPaymentsSubgraph isn't configured, offerMonitor is null and the gate is skipped (acceptIndexingAgreement still runs). Locks in the "prior behaviour preserved" claim from the PR body. Promotes the inline safetyMarginSeconds to a file-level constant (OFFER_GATE_DEADLINE_SAFETY_MARGIN_SECONDS) alongside DIPS_ACCEPTANCE_INTERVAL so the timing budget is discoverable from the top of the file. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Base automatically changed from feat/dips-new-subgraph to feat/dips-on-chain-cancel on April 28, 2026 19:37
Base automatically changed from feat/dips-on-chain-cancel to feat/dips-on-chain-collect on April 28, 2026 20:09
f792bc2 to
078630f
Compare
Wraps each phase of processProposal with process.hrtime.bigint timers and emits a structured INFO summary per proposal (rule check, offer existence check, graphNode.ensure, accept). Outcome label disambiguates the early-return paths (deadline, blocklist, offer-gate) from the accept phase. Pure observability; no logic changes. Existing log lines for "Proposal accepted on-chain", "Rejecting proposal: deterministic contract error", and "Transient error accepting proposal, will retry" continue to disambiguate the accept outcome. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a classifyAbiMismatch helper that pattern-matches ethers errors arising from ABI shape mismatches at the encode/dispatch layer:
- UNSUPPORTED_OPERATION with operation: "fragment" (wrong selector)
- INVALID_ARGUMENT (wrong argument tuple)

These errors mean the installed contract types disagree with what the deployed contract expects. Retrying does not help; the call will fail identically every tick. Marking the proposal rejected immediately and cleaning up the dips rule lets dipper reassess in seconds rather than burning the full RCA deadline.

Without this, a single class of failure (e.g. installed @graphprotocol/interfaces ABI lagging the deployed contract) cascades into mass agreement expirations, and via dipper's 30-day decline lookback, can blocklist every (indexer, deployment) pair after just one failed batch.

Existing CALL_EXCEPTION handling (revert data + tryParseCustomError) is unchanged; this only adds the abi-mismatch fast-path before it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
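A rough sketch of the pattern match this commit describes, written against plain objects rather than real ethers error classes. The function name `classifyAbiMismatchSketch`, the return labels, and the assumption that the error exposes `code` and `operation` as top-level string properties are illustrative; the PR's actual helper may inspect more fields.

```typescript
// Hypothetical sketch; the real classifyAbiMismatch in the PR may differ.
type AbiMismatchKind = 'wrong_selector' | 'wrong_arguments'

function classifyAbiMismatchSketch(err: unknown): AbiMismatchKind | null {
  if (typeof err !== 'object' || err === null) return null

  // Assumption: the error carries `code` and `operation` as own properties.
  const e = err as { code?: unknown; operation?: unknown }

  // The installed ABI has no fragment matching the function being called.
  if (e.code === 'UNSUPPORTED_OPERATION' && e.operation === 'fragment') {
    return 'wrong_selector'
  }
  // The arguments could not be encoded against the installed ABI.
  if (e.code === 'INVALID_ARGUMENT') return 'wrong_arguments'

  // Anything else (including CALL_EXCEPTION) falls through to existing handling.
  return null
}
```

A non-null result would map to "reject now, do not retry", while `null` leaves the error to the existing CALL_EXCEPTION and transient-error paths.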
Moves graphNode.ensure from processProposal into ensureDipsRuleForProposal. The deploy is idempotent on graph-node, so moving it to rule-creation time means each indexer pays the cold-deploy cost once per (deployment, proposal) instead of once per accept-loop tick per agreement. At 50-request scale (3 candidates per request, ~25 proposals queueing on each agent's accept loop) the previous placement could re-invoke ensure ~25 times on each tick across the same handful of deployments. With this change the cold-deploy work happens at the moment dipper's proposal first lands, before the offer-existence gate, and is amortised across any reassessment cycles for the same deployment. If graphNode.ensure throws, the proposal stays pending (returns false from ensureDipsRuleForProposal) so the next acceptance-loop tick retries — matching the existing retry semantics of other transient failures. The phase-timing log added in the previous commit attributes ensure duration to ruleMs going forward (one fewer phase reported, but the total wall time captured by totalMs is unchanged). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Replaces the serial for-loop in startProposalAcceptanceLoop with a pMap over proposals at DIPS_ACCEPT_CONCURRENCY=4. Each processProposal call operates on a distinct agreementId with no shared mutable state, so the parallel work is independent. The wallet's nonce queue inside transactionManager.executeTransaction already handles ordering and retries on collision, so concurrent submits are safe. Concurrency is capped low enough that one slow in-flight call doesn't head-of-line everything else, while removing the serial-tick bottleneck observed at 50-request scale (where ~25 proposals per agent per tick had to drain one at a time). Combined with the previous commit hoisting graphNode.ensure out of processProposal, the accept critical path is now tx-bound rather than deploy-bound, and parallelising that path yields a meaningful throughput multiplier. Per-proposal failures stay isolated: handleAcceptError already absorbs errors without rethrowing, and the explicit try/catch wrapper around processProposal here keeps the same isolation under pMap with stopOnError: false. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Adds a periodic sweep on the indexer-agent that diffs each dips-basis indexing rule against the indexing-payments-subgraph's view of the indexer's accepted agreements. If a rule has no matching IndexingAgreement in state Accepted, the rule is deleted; the agent's normal reconciliation loop then closes the corresponding allocation through its existing path. Defends against the "indexer indexing without payment" failure mode: when something kills the originally-paired agreement (dipper expires it, agent restarts mid-flow, DB cleanup loses in-flight context), the rule survives and the agent keeps the allocation alive. Without an oracle the agent has no signal that the allocation is unpaid. The indexing-payments-subgraph is the right oracle because it tracks exactly the on-chain accept events that bind agreements to indexers. Single batched query (indexingAgreements where indexer = SELF, state = Accepted) plus _meta block timestamp; the timestamp gates the sweep so we never disable rules from stale subgraph data. If the subgraph is more than DIPS_SWEEP_STALENESS_THRESHOLD_SECONDS (300s) behind wall-clock, or the query errors, the sweep skips this tick and logs. Loop registered alongside startProposalAcceptanceLoop in AllocationManager. The sweep is independently configurable: it runs every DIPS_SWEEP_INTERVAL (60s) on a separate timer from the 5s accept loop, so a slow sweep tick can't backpressure acceptance. Tests cover: stale rule removal, healthy rule preservation, mixed backed/unbacked, subgraph staleness skip, query error skip, and the no-subgraph-configured no-op path. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
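The sweep's core decision (skip on stale data, otherwise delete rules with no accepted agreement behind them) can be sketched as a pure planning step. Everything named here other than DIPS_SWEEP_STALENESS_THRESHOLD_SECONDS is hypothetical, and matching a rule to an agreement by deployment ID is an assumption; the commit message does not say which key the real sweep joins on.

```typescript
// Hypothetical sketch of the sweep's planning step; not the PR's actual code.
interface DipsRule {
  id: number
  deployment: string // assumption: rules and agreements are matched by deployment ID
}

type SweepPlan =
  | { skipped: true; reason: string }
  | { skipped: false; rulesToDelete: DipsRule[] }

const DIPS_SWEEP_STALENESS_THRESHOLD_SECONDS = 300

function planDipsSweep(
  rules: DipsRule[],
  acceptedDeployments: Set<string>, // from the batched "state = Accepted" query
  subgraphBlockTimestamp: number, // from the _meta block, in seconds
  nowSeconds: number,
): SweepPlan {
  // Never act on stale subgraph data: a lagging subgraph would make
  // recently accepted agreements look missing.
  const lagSeconds = nowSeconds - subgraphBlockTimestamp
  if (lagSeconds > DIPS_SWEEP_STALENESS_THRESHOLD_SECONDS) {
    return { skipped: true, reason: `subgraph is ${lagSeconds}s behind` }
  }

  // A rule with no accepted agreement behind it is unpaid work; mark it for deletion.
  const rulesToDelete = rules.filter((rule) => !acceptedDeployments.has(rule.deployment))
  return { skipped: false, rulesToDelete }
}
```

A query error would be handled by the caller the same way as the stale case: log and skip the tick without deleting anything.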
Motivation
When an indexer accepts an offer to index a subgraph, the indexer-agent confirms that acceptance on-chain by calling the SubgraphService contract. Until now, that acceptance was tied to the agent's 120-second reconciliation cycle, which meant the agent needed two or more cycles (240+ seconds) before it actually attempted the on-chain call. Recurring collection agreements have a 300-second deadline. That left at most 60 seconds of slack, and on a fresh deploy with any backlog at all, agreements expired before the agent got around to accepting them.
A second, related failure mode shows up under load. dipper submits an `offer()` transaction on-chain ahead of asking the indexer to accept; if dipper's transaction is evicted from the mempool (for example, because another service holding the same wallet sends a higher-fee transaction in the same window), the indexer-agent walks up to a contract that has no record of the offer. The call reverts, the agent treats the revert as a permanent rejection, and the indexing slot is lost to reassessment even though the only real fix was waiting one more tick for dipper to resubmit.

The combined effect is that DIPs agreements expire or get reassessed when they should land cleanly. Operators see opaque revert errors and can't tell whether the system is broken or just unlucky.
Summary
- A dedicated 5s polling loop (`startProposalAcceptanceLoop`) that polls `pending_rca_proposals` independently of the main reconciliation cycle. Acceptance attempts now happen well inside the 300-second deadline.
- An offer-existence gate via a new `OfferMonitor` that queries the indexing-payments-subgraph for the offer before the agent attempts to accept. If the offer isn't yet on-chain, the agent stays pending and the 5s loop re-pops the row on the next tick. If the agreement deadline is within 30 seconds, the agent gives up cleanly with `offer_never_landed` so reassessment can pick a replacement.
- Supporting fixes: unpacking the agreement and signature as separate arguments to `acceptIndexingAgreement`, creating the local indexing rule before the accept transaction fires (so the new loop doesn't race the reconciliation cycle into orphan-allocation territory), ensuring the subgraph deployment exists on graph-node before the multicall, and logging the full revert context on accept failure.
- Wrapping `collectAgreementPayments` and `matchAgreementAllocations` in try/catch so a transient subgraph or RPC failure skips that tick instead of aborting the entire reconcile cycle for every configured network.
Changes
- `packages/indexer-common/src/indexing-fees/offer-monitor.ts`: converts the UUID-format `agreementId` from `pending_rca_proposals` to the bytes16 hex form the subgraph keys offers by; treats subgraph errors as transient.
- An optional `indexingPaymentsSubgraph` field on the network specification, plumbed through the agent's CLI flags. When absent, the gate is bypassed and the prior behaviour is preserved.
- `startProposalAcceptanceLoop` in `dips.ts` plus its registration from `agent.ts`.
- Tests for `OfferMonitor`, three branches of the gate in `processProposal` (offer absent + deadline far → wait; offer absent + deadline near → reject; offer present → proceed), and supporting test scaffolding for `graphNode.ensure`.
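The UUID-to-bytes16 conversion mentioned for offer-monitor.ts can be illustrated with a small helper. This is a sketch under assumptions: the name `uuidToBytes16Hex` is hypothetical, and the `0x` prefix and lowercase output are guesses about how the subgraph formats its IDs, not something the PR text confirms.

```typescript
// Hypothetical sketch; the PR's OfferMonitor may format the ID differently.
function uuidToBytes16Hex(uuid: string): string {
  // A UUID is 16 bytes written as 32 hex digits in 8-4-4-4-12 groups,
  // so dropping the hyphens yields the raw bytes16 value.
  const hex = uuid.replace(/-/g, '').toLowerCase()
  if (!/^[0-9a-f]{32}$/.test(hex)) {
    throw new Error(`not a valid UUID: ${uuid}`)
  }
  // Assumption: the subgraph keys offers by 0x-prefixed lowercase hex.
  return `0x${hex}`
}
```

For instance, `uuidToBytes16Hex('123e4567-e89b-12d3-a456-426614174000')` returns `0x123e4567e89b12d3a456426614174000`. Validating the input up front keeps a malformed row in `pending_rca_proposals` from turning into a subgraph query that silently matches nothing.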