
fix(finetune): make the inert LIBERO drop-gate honest + fail-closeable #231

Open
rylinjames wants to merge 1 commit into main from fix/libero-drop-gate-inert



@rylinjames

Collaborator

Audit §3.4 / Part 1 #10.

Problem

libero_drop_gate is supposed to run teacher-vs-student LIBERO rollouts and abort the ship if task-success drops too far. But the rollout harness (tether.libero_harness) isn't shipped (LIBERO was archived 2026-04-17), so _run_teacher_student_rollouts always raises _LiberoUnavailable and the gate silently skips. Every distill ships with status "ok" having verified nothing — and a SKIP is indistinguishable from a PASS. A gate that can't fail is worse than no gate.

Fix

The sim harness can't be shipped here, so make the gap honest and give callers a fail-closed switch:

  • PostprocessReport gains libero_gate_status; the hook records the outcome on every path (passed / failed / crashed / skipped_unavailable / skipped_disabled / skipped_phase / skipped_missing_inputs). A skip is now visible and auditable — never read as a pass.
  • New extra_lerobot_args.libero_gate_require (default False): when True, an unavailable harness aborts the ship rather than silently skipping — a production distill can demand real task-success verification.
  • Docstring states plainly that the harness is absent in OSS builds and how to fail closed.

Tests

tests/test_libero_drop_gate.py (6): unavailable → skipped_unavailable + ships by default; require=True → force_abort; pass/fail record status + abort; skip + non-distill phase recorded. ruff clean.

🤖 Generated with Claude Code

Audit §3.4 / Part 1 #10.

libero_drop_gate runs teacher-vs-student LIBERO rollouts and aborts the ship
if task-success drops too far. But the rollout harness (tether.libero_harness)
isn't shipped (LIBERO was archived 2026-04-17), so _run_teacher_student_rollouts
ALWAYS raises _LiberoUnavailable and the gate silently skips — every distill
ships with status "ok" having verified nothing, and a SKIP is indistinguishable
from a PASS. A gate that can't fail is worse than no gate.

Can't ship the sim harness here, so make the gap honest and give callers a
fail-closed switch instead:
- PostprocessReport gains libero_gate_status; the hook records the outcome on
  EVERY path (passed / failed / crashed / skipped_unavailable /
  skipped_disabled / skipped_phase / skipped_missing_inputs). A skip is now
  visible and auditable, never read as a pass.
- New extra_lerobot_args.libero_gate_require (default False): when True, an
  unavailable harness ABORTS the ship rather than silently skipping — so a
  production distill can demand real task-success verification.
- Docstring states plainly that the harness is absent in OSS builds and how to
  fail closed.

Tests (tests/test_libero_drop_gate.py, 6): unavailable→skipped_unavailable +
ships by default; require=True→force_abort; pass/fail record status + abort;
skip + non-distill phase recorded. ruff clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>