Problem
The compute worker sets submission status to FINISHED before uploading scores and outputs. If push_scores() or push_output() fails (network error, timeout, API failure), the submission is marked Finished but has no scores or outputs.
This is mechanism M8 from the EEG Foundation Challenge incident analysis (report.md section 4).
Root Cause
In compute_worker.py, the flow was:
run.start()  # Sets status to FINISHED at line 1448
if run.is_scoring:
    run.push_scores()  # Upload scores AFTER status update
run.push_output()  # Upload outputs AFTER status update
If push_scores() fails, the submission stays Finished with no scores, which is a data integrity violation.
Data Fingerprint
From the incident:
- "finished-but-no-score reports" observed in production
- Participants reported submissions marked as Finished with empty scores
Impact
- Users see Finished submissions with no results
- Leaderboard is incomplete
- No retry mechanism exists to recover
Solution
- Reorder operations: upload scores/outputs before setting status to FINISHED
- Add retry logic to push_scores() with exponential backoff
- Validate HTTP responses and raise errors on 4xx/5xx
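The retry and validation steps above could look roughly like the sketch below. It is not the code from compute_worker.py: UploadError, check_status, and retry_with_backoff are hypothetical names, and the attempt count and delays are illustrative assumptions. A fake upload stands in for the real HTTP call so the example runs on its own.

```python
import time


class UploadError(Exception):
    """Raised when an upload returns a 4xx/5xx status (hypothetical name)."""


def check_status(status_code):
    # Treat any 4xx/5xx response as a failure instead of silently continuing.
    if status_code >= 400:
        raise UploadError(f"upload failed with HTTP {status_code}")
    return status_code


def retry_with_backoff(operation, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call operation() until it succeeds or max_attempts is exhausted.

    After the n-th failed attempt (counting from 0) it waits
    base_delay * 2**n seconds before trying again.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except UploadError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller see the failure
            sleep(base_delay * 2 ** attempt)


# Fake upload that fails twice and then succeeds. The sleep function is
# replaced by delays.append so the example records the waits instead of sleeping.
responses = iter([503, 500, 200])
delays = []
result = retry_with_backoff(lambda: check_status(next(responses)), sleep=delays.append)
print(result, delays)  # 200 [1.0, 2.0]
```

This sketch retries every UploadError, including 4xx responses. A real implementation may prefer to retry only transient failures (5xx, timeouts) and fail fast on client errors.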
New Flow
run.start()  # Completes scoring but doesn't set FINISHED
if run.is_scoring:
    run.push_scores()  # Upload scores first (with retries)
run.push_output()  # Then upload outputs
if run.is_scoring:
    run._update_status(SubmissionStatus.FINISHED)  # Only now mark as FINISHED
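To illustrate why the ordering matters, here is a minimal sketch that runs the new flow against a stand-in object. FakeRun and execute are hypothetical and exist only for this example. The status strings are plain placeholders rather than the real SubmissionStatus values, and the simulated failure is an assumption about how a failed upload would surface.

```python
class FakeRun:
    """Hypothetical stand-in for the worker's run object, for illustration only."""

    def __init__(self, is_scoring, scores_upload_fails):
        self.is_scoring = is_scoring
        self.scores_upload_fails = scores_upload_fails
        self.status = "Scoring"
        self.scores_uploaded = False

    def start(self):
        pass  # the scoring program would run here; status is left untouched

    def push_scores(self):
        if self.scores_upload_fails:
            raise ConnectionError("simulated network failure")
        self.scores_uploaded = True

    def push_output(self):
        pass

    def _update_status(self, status):
        self.status = status


def execute(run):
    # Same ordering as the new flow: uploads first, FINISHED last.
    run.start()
    if run.is_scoring:
        run.push_scores()
    run.push_output()
    if run.is_scoring:
        run._update_status("Finished")


ok = FakeRun(is_scoring=True, scores_upload_fails=False)
execute(ok)

broken = FakeRun(is_scoring=True, scores_upload_fails=True)
try:
    execute(broken)
except ConnectionError:
    pass  # assumption: the real worker would record a failure status here

print(ok.status, ok.scores_uploaded)          # Finished True
print(broken.status, broken.scores_uploaded)  # Scoring False
```

Because the status update is the last step, an exception from either upload leaves the submission in a non-Finished state, so it can never be Finished without scores.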
Testing
Added a k6 integration test:
- tests/k6/test_m8_finished_has_scores.js: verifies the invariant Finished ⟹ has scores
- tests/k6/run_m8_test.sh: bash orchestrator
- Pass criteria: finished_without_scores == 0
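The invariant behind the pass criteria can be expressed in a few lines. The sketch below is written in Python for consistency with the worker code and is not the k6 script itself. The record fields (status, scores) and the sample data are assumptions made for illustration.

```python
def count_finished_without_scores(submissions):
    """Count submissions that violate the invariant Finished => has scores."""
    return sum(
        1
        for s in submissions
        if s["status"] == "Finished" and not s["scores"]
    )


# Illustrative records only; real submissions come from the API.
submissions = [
    {"id": 1, "status": "Finished", "scores": [0.91]},
    {"id": 2, "status": "Failed", "scores": []},
    {"id": 3, "status": "Finished", "scores": []},  # violates the invariant
]
finished_without_scores = count_finished_without_scores(submissions)
print(finished_without_scores)  # 1
```

A passing run corresponds to this count being 0 across all submissions created during the test.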
References
- Incident report: report.md section 4, M8
- Fix recommendation: report.md section 7.4
- Related: M2 (fire-and-forget status updates), M1 (scoring re-enqueue)