transformerless_lm: continuous self-distillation + cycle checkpointing by RandomCoder-lab · Pull Request #5 · RandomCoder-lab/OMC

RandomCoder-lab · 2026-05-23T03:20:09Z

Summary

Follow-up to PR #4 (omniweight loss on training data). After closing the train/inference asymmetry, the natural question was whether the self-distillation ratchet (active_base ← seed + appended best refined outputs) compounds further than the bounded 6-cycle window allows. This PR makes the loop unbounded and Ctrl-C-resumable.

Changes

Two new flags on train_self_recursive.py:

--continuous — replaces for cycle in range(n_cycles) with an unbounded loop. n_cycles still controls steps_per_cycle = args.steps // n_cycles so per-cycle training budget stays calibrated; the cycle counter just keeps going. K-shrink schedule clamps to K_min once global_step exceeds args.steps, which is the standard end state of the curriculum.
--checkpoint PATH — serializes the entire distillation state every cycle:
- model state_dict + FibAdamW optimizer state
- active_base tensor (the growing self-distilled corpus)
- cycle counter, global_step, best_creativity, best_val/step
- cycle_summary, rejection counters, best_refined_seq
Atomic write via tmp + os.replace so an interrupt mid-save can't corrupt the file. If the checkpoint exists at startup, training resumes from saved_cycle + 1 with active_base fully intact.
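The tmp + os.replace pattern and the resume-at-saved_cycle + 1 rule can be sketched as follows. This is a minimal illustration, not the PR's code: the function names, the use of pickle instead of whatever serializer the script actually uses, the "cycle" key, and the 0-based starting cycle are all assumptions.

```python
import os
import pickle
import tempfile


def save_checkpoint_atomic(state: dict, path: str) -> None:
    """Write `state` to `path` so a reader never sees a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    # Keep the temp file in the target directory so os.replace stays on
    # one filesystem; the rename is then a single atomic step.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(state, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, path)
    except BaseException:
        # Covers KeyboardInterrupt too: drop the partial temp file, re-raise.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def load_checkpoint(path: str):
    """Return (state, start_cycle); (None, 0) when no checkpoint exists."""
    if not os.path.exists(path):
        return None, 0
    with open(path, "rb") as f:
        state = pickle.load(f)
    # Resume at the cycle after the last one that was fully saved.
    return state, state["cycle"] + 1
```

If the process is interrupted before os.replace runs, the previous checkpoint at `path` is left untouched, so a resume falls back to the last completed cycle.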

Behavior

Default (no flags): byte-identical to the v88 + omniweight-loss bounded 6-cycle run.
--continuous --checkpoint X.pt: runs until interrupted and saves after every cycle. Stopping with Ctrl-C and re-running the same command resumes the run.
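The difference between the bounded and continuous modes can be sketched like this. The linear shape of the K-shrink schedule, the k_max/k_min values, and every function name here are hypothetical; the PR only states that n_cycles fixes steps_per_cycle and that K clamps to K_min once global_step passes args.steps.

```python
import itertools


def steps_per_cycle(total_steps: int, n_cycles: int) -> int:
    # n_cycles keeps calibrating the per-cycle budget in both modes.
    return total_steps // n_cycles


def k_for_step(global_step, total_steps, k_max, k_min):
    """Hypothetical linear K-shrink, clamped to k_min past total_steps."""
    if global_step >= total_steps:
        return k_min
    frac = global_step / total_steps
    return max(k_min, round(k_max - frac * (k_max - k_min)))


def cycle_indices(start_cycle: int, n_cycles: int, continuous: bool):
    # Continuous mode swaps the bounded range for an endless counter.
    if continuous:
        return itertools.count(start_cycle)
    return range(start_cycle, n_cycles)


def run(total_steps, n_cycles, continuous, start_cycle=0, max_cycles_for_demo=None):
    per_cycle = steps_per_cycle(total_steps, n_cycles)
    global_step = start_cycle * per_cycle
    log = []
    for cycle in cycle_indices(start_cycle, n_cycles, continuous):
        # Demo-only stop so the sketch terminates; a real run relies on Ctrl-C.
        if max_cycles_for_demo is not None and len(log) >= max_cycles_for_demo:
            break
        k = k_for_step(global_step, total_steps, k_max=8, k_min=2)
        # ... train for per_cycle steps, refine, extend active_base ...
        global_step += per_cycle
        log.append((cycle, global_step, k))
    return log
```

With `run(6000, 6, continuous=True, max_cycles_for_demo=8)`, cycles 6 and 7 start past the 6000-step budget, so they run at the clamped K value while the cycle counter keeps increasing.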

Usage

# Forever self-distillation with omniweight loss
python3 train_self_recursive.py --omniweight-loss \
    --continuous --checkpoint omniweight_distill.pt

# Stop with Ctrl-C, resume by re-running the same command:
python3 train_self_recursive.py --omniweight-loss \
    --continuous --checkpoint omniweight_distill.pt
# Output: "resuming from checkpoint: omniweight_distill.pt"
#         "resumed at cycle N, active_base=M tokens, best_creativity=X"
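For orientation, the flags used above could be declared with argparse roughly as follows. The flag names come from this PR; the defaults (6000 steps, 6 cycles, no checkpoint) and the `build_parser` helper are assumptions for illustration and may not match the script.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    p = argparse.ArgumentParser(prog="train_self_recursive.py")
    # Defaults below are assumed, not taken from the real script.
    p.add_argument("--steps", type=int, default=6000)
    p.add_argument("--n-cycles", type=int, default=6)
    p.add_argument("--omniweight-loss", action="store_true")
    # New in this PR: unbounded cycle loop.
    p.add_argument("--continuous", action="store_true")
    # New in this PR: per-cycle checkpoint file, also used for resuming.
    p.add_argument("--checkpoint", metavar="PATH", default=None)
    return p


args = build_parser().parse_args(
    ["--omniweight-loss", "--continuous", "--checkpoint", "omniweight_distill.pt"]
)
```

Parsing an empty argument list leaves `continuous` false and `checkpoint` unset, which corresponds to the unchanged default behavior described earlier.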

Test plan

Smoke: 2 cycles → Ctrl-C → resume → verify cycle counter, active_base size, best_creativity all restored
Long run: continuous + omniweight-loss with --steps 6000 --n-cycles 6 and let it ride 20+ cycles, log creativity trajectory
Compare cycle-by-cycle creativity in continuous mode vs the bounded v89 numbers (cycle 1 mean 0.6460, cycle 2 mean 0.6426 from the killed run)
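The cycle-by-cycle comparison in the last item could be automated with a small helper like the one below. The two baseline means are the ones quoted in the test plan; the input format (a dict mapping cycle number to a list of creativity scores) and the function name are assumptions, since the script's log format is not shown here.

```python
from statistics import mean

# Bounded-run means quoted in the test plan (cycle number -> mean creativity).
BOUNDED_BASELINE = {1: 0.6460, 2: 0.6426}


def compare_to_baseline(creativity_by_cycle, baseline=BOUNDED_BASELINE):
    """Return (cycle, mean, delta_vs_baseline) rows; delta is None when
    the baseline has no entry for that cycle."""
    rows = []
    for cycle in sorted(creativity_by_cycle):
        cycle_mean = mean(creativity_by_cycle[cycle])
        base = baseline.get(cycle)
        delta = None if base is None else cycle_mean - base
        rows.append((cycle, cycle_mean, delta))
    return rows
```

Cycles beyond the bounded window have no baseline, so their rows carry only the continuous-mode mean.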

Generated by Claude Code

After PR #4 closed the train/inference omniweight asymmetry, the natural follow-up: don't stop at cycle 6. The active_base ratchet (seed + appended best refined outputs) is exactly the kind of process where compounding past a fixed budget might find regimes the 6-cycle window can't reach.

--continuous: replaces `for cycle in range(n_cycles)` with an unbounded loop. n_cycles still controls steps_per_cycle (args.steps // n_cycles) so per-cycle training budget stays calibrated; the cycle counter just keeps going. K-shrink schedule clamps to K_min once global_step exceeds args.steps, which is the standard end state of the curriculum anyway.

--checkpoint PATH: serializes the entire distillation state every cycle (model state_dict, FibAdamW optimizer state, active_base, cycle counter, global_step, best_creativity, best_val/step, cycle_summary, rejection counters, best_refined_seq). Atomic write via tmp+os.replace so an interrupt mid-save can't corrupt the file. If the checkpoint exists at startup, training resumes from the saved cycle+1 with the active_base fully intact -- the ratchet picks up exactly where it stopped.

Default behavior unchanged: omitting both flags reproduces the v88 + omniweight-loss bounded 6-cycle run.

Run a forever-distillation with omniweight-loss:

python3 train_self_recursive.py --omniweight-loss \
    --continuous --checkpoint omniweight_distill.pt

Resume after Ctrl-C: re-run the same command. Checkpoint state restored, next cycle is start_cycle.

RandomCoder-lab wants to merge 1 commit into master from claude/find-claude-md-arn0F
