16 changes: 8 additions & 8 deletions coordinator/internal/logic/submitproof/proof_receiver.go
@@ -346,14 +346,6 @@ func (m *ProofReceiverLogic) validator(ctx context.Context, proverTask *orm.Prov
return ErrValidatorFailureProofMsgStatusNotOk
}

- // if prover task FailureType is SessionInfoFailureTimeout, the submit proof is timeout, need skip it
- if types.ProverTaskFailureType(proverTask.FailureType) == types.ProverTaskFailureTypeTimeout {
- m.validateFailureProverTaskTimeout.Inc()
- log.Info("proof submit proof have timeout, skip this submit proof", "hash", proofParameter.TaskID, "taskType", proverTask.TaskType,
- "proverName", proverTask.ProverName, "proverPublicKey", pk, "proofTime", proofTimeSec)
- return ErrValidatorFailureProofTimeout
- }

// store the proof to prover task
if updateTaskProofErr := m.updateProverTaskProof(ctx, proverTask, proofParameter); updateTaskProofErr != nil {
log.Warn("update prover task proof failure", "hash", proofParameter.TaskID, "proverPublicKey", pk,
@@ -368,6 +360,14 @@ func (m *ProofReceiverLogic) validator(ctx context.Context, proverTask *orm.Prov
"taskType", proverTask.TaskType, "proverName", proverTask.ProverName, "proverPublicKey", pk)
return ErrValidatorFailureTaskHaveVerifiedSuccess
}

+ // if prover task FailureType is SessionInfoFailureTimeout, the submit proof is timeout, but we still accept it
+ if types.ProverTaskFailureType(proverTask.FailureType) == types.ProverTaskFailureTypeTimeout {
+ m.validateFailureProverTaskTimeout.Inc()
+ log.Warn("proof submit proof have timeout", "hash", proofParameter.TaskID, "taskType", proverTask.TaskType,
+ "proverName", proverTask.ProverName, "proverPublicKey", pk, "proofTime", proofTimeSec)
+ }
Comment on lines +364 to +369

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

Timeout now only logs/metrics and proceeds — double‑check interaction with ProvingStatus guard and clean up comment/log text

The new behavior (incrementing validateFailureProverTaskTimeout and logging, but not returning) matches the PR goal of accepting late proofs after a timeout. A couple of points to verify/tidy up:

  1. Interaction with “submit twice” guard (lines 315‑329)
    If timeouts elsewhere in the system are recorded by setting the task’s ProvingStatus to types.ProverProofInvalid (or Valid), then the early guard:

    if types.ProverProveStatus(proverTask.ProvingStatus) == types.ProverProofValid ||
        types.ProverProveStatus(proverTask.ProvingStatus) == types.ProverProofInvalid {
        ...
        return ErrValidatorFailureProverTaskCannotSubmitTwice
    }

    will fire before this timeout block, and late proofs for such tasks will still be rejected. For the new behavior to be effective, timed‑out tasks you still want to accept later proofs for must not be marked ProverProofValid/ProverProofInvalid before the submission arrives. Please confirm that your timeout handling only sets FailureType to ProverTaskFailureTypeTimeout (see common/types/db.go:98-104) and leaves ProvingStatus in a state that passes this guard.

  2. Metrics semantics
    validateFailureProverTaskTimeout is documented as “validate failure timeout”, but it now increments even for proofs you accept and verify successfully. That’s probably fine if you interpret it as “number of submissions whose task had previously timed out”, but it will no longer be a strict subset of validateFailureTotal (which only increments on non‑nil err). Just keep this in mind for dashboards/alerts.

  3. Minor comment & log clarity (optional polish)

    • The comment mentions SessionInfoFailureTimeout, but the actual enum is ProverTaskFailureTypeTimeout.
    • The log message "proof submit proof have timeout" is hard to read.

    You could make these clearer with something like:

    - // if prover task FailureType is SessionInfoFailureTimeout, the submit proof is timeout, but we still accept it
    + // If prover task FailureType is ProverTaskFailureTypeTimeout, the submission has timed out, but we still accept it.
      if types.ProverTaskFailureType(proverTask.FailureType) == types.ProverTaskFailureTypeTimeout {
          m.validateFailureProverTaskTimeout.Inc()
    -     log.Warn("proof submit proof have timeout", "hash", proofParameter.TaskID, "taskType", proverTask.TaskType,
    -         "proverName", proverTask.ProverName, "proverPublicKey", pk, "proofTime", proofTimeSec)
    +     log.Warn("proof submission timed out but will be verified", "hash", proofParameter.TaskID, "taskType", proverTask.TaskType,
    +         "proverName", proverTask.ProverName, "proverPublicKey", pk, "proofTime", proofTimeSec)
      }

To help verify how timeouts are recorded across the codebase, you can run:


🏁 Script executed:

#!/bin/bash
# Inspect usages of ProverTaskFailureTypeTimeout and related status updates
rg -n "ProverTaskFailureTypeTimeout" -C3
rg -n "UpdateProverTaskProvingStatusAndFailureType" -C3

Length of output: 6120


Critical: Timeout-marked tasks are rejected before reaching the new acceptance logic

The script output reveals a blocking issue with the intended behavior. When a timeout is detected, collect_proof.go:237 calls:

UpdateProverTaskProvingStatusAndFailureType(c.ctx, assignedProverTask.UUID, types.ProverProofInvalid, types.ProverTaskFailureTypeTimeout, tx)

This sets ProvingStatus to ProverProofInvalid. However, the guard at lines 315–329 rejects submissions before the new timeout handler at lines 364–369 can execute:

if types.ProverProveStatus(proverTask.ProvingStatus) == types.ProverProofValid ||
    types.ProverProveStatus(proverTask.ProvingStatus) == types.ProverProofInvalid {
    ...
    return ErrValidatorFailureProverTaskCannotSubmitTwice
}

Result: Late proofs for timed-out tasks will always be rejected by the "cannot submit twice" guard, never reaching the new acceptance logic. The PR's goal cannot be achieved with this implementation.

Fix: Either (1) skip the "submit twice" guard when FailureType == ProverTaskFailureTypeTimeout, or (2) leave timed-out tasks with ProvingStatus unset (or set to a non-rejected state) so they pass the guard. The latter likely requires changes to collect_proof.go:237 as well.

🤖 Prompt for AI Agents
In coordinator/internal/logic/submitproof/proof_receiver.go around lines
315–369, the current "cannot submit twice" guard rejects tasks already marked
ProverProofInvalid before the timeout-acceptance logic at 364–369 runs; to fix,
either update the guard (lines ~315–329) to allow submissions when
proverTask.FailureType == types.ProverTaskFailureTypeTimeout by skipping the
ProverProofInvalid/ProverProofValid check for timeout-marked tasks, or instead
change the code that marks timeouts (collect_proof.go:237) so it does not set
ProvingStatus to ProverProofInvalid for timeout cases (leave it unset or set to
a non-rejecting state) so the existing guard won’t block late proofs—pick one
approach and implement the corresponding change consistently across both files.


return nil
}
