Add PipelineTrainer checkpoint artifact uploads#671

Open
arcticfly wants to merge 3 commits into main from codex/save-checkpoint-artifact
@arcticfly
Collaborator

Summary

  • add a save_checkpoint_artifact parameter alongside save_checkpoint on PipelineTrainer
  • upload saved eval checkpoints to W&B as LoRA artifacts via the existing W&B deployment helper
  • normalize compatible Unsloth Llama base-model aliases to W&B-supported metadata IDs
  • add focused unit coverage for the new trainer parameter and W&B base-model aliasing
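
As a rough illustration of the behavior summarized above, the following is a minimal sketch, not the code in this PR. It assumes an opt-in boolean flag that gates an upload callback after each saved checkpoint, and a lookup table that rewrites base-model aliases before upload. `CheckpointSaver`, `normalize_base_model`, the `upload` callback, and the alias entry are all hypothetical names chosen for illustration; the real logic lives in `src/art/pipeline_trainer/trainer.py` and `src/art/utils/deployment/wandb.py` and may differ.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical alias table with placeholder model IDs; the actual mapping
# in the PR is not shown in the description and may look different.
_BASE_MODEL_ALIASES = {
    "unsloth/example-llama": "meta-llama/example-llama",
}


def normalize_base_model(name: str) -> str:
    """Return the mapped ID for a known alias, otherwise the name unchanged."""
    return _BASE_MODEL_ALIASES.get(name, name)


@dataclass
class CheckpointSaver:
    """Illustrative stand-in for the trainer's checkpoint hook."""

    upload: Callable[[str, str], None]  # (checkpoint_path, base_model_id)
    base_model: str
    save_checkpoint_artifact: bool = False  # opt-in, mirrors the new parameter name
    saved_steps: list[int] = field(default_factory=list)

    def on_checkpoint(self, step: int, path: str) -> None:
        # The checkpoint is always recorded locally.
        self.saved_steps.append(step)
        # The upload only happens when the flag is enabled.
        if self.save_checkpoint_artifact:
            self.upload(path, normalize_base_model(self.base_model))
```

Keeping the flag off by default in this sketch means existing callers that only set `save_checkpoint` would see no change in behavior; whether the PR uses the same default is not stated in the description.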

Validation

  • uv run ruff check src/art/pipeline_trainer/trainer.py src/art/utils/deployment/wandb.py tests/unit/test_pipeline_trainer_local_backend.py
  • python -m py_compile src/art/pipeline_trainer/trainer.py src/art/utils/deployment/wandb.py
  • import/signature check confirmed PipelineTrainer.__init__ includes save_checkpoint_artifact
  • downstream Willow smoke job saved W&B LoRA artifacts at step1 and step2, then both artifacts responded to W&B inference requests

Note: the full ART unit file collection was blocked locally by an existing environment mismatch: transformers requires huggingface-hub<1.0 but the ART venv has huggingface-hub==1.7.2.
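
For reviewers who want to reproduce the mismatch described in the note, here is a minimal sketch of the underlying comparison, assuming plain dotted numeric versions. The helper names are hypothetical and not part of ART; for real use, `packaging.version` handles pre-releases and other cases this sketch ignores, and `pip check` reports such conflicts directly.

```python
def parse_version(v: str, width: int = 4) -> tuple[int, ...]:
    """Split a dotted numeric version and zero-pad it to a fixed width."""
    parts = [int(p) for p in v.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def violates_upper_bound(installed: str, upper_bound: str) -> bool:
    """True when `installed` does not satisfy a `<upper_bound` requirement."""
    return parse_version(installed) >= parse_version(upper_bound)


# The case from the note: huggingface-hub<1.0 required, 1.7.2 installed.
print(violates_upper_bound("1.7.2", "1.0"))
```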
