feat: add Trackio as a new experiment monitoring backend #8065
chanduripranav wants to merge 1 commit
…#7964)
- Add TrackioMonitor class in deepspeed/monitor/trackio.py
- Add TrackioConfig with enabled and project fields in config.py
- Add check_trackio_availability() helper in utils.py
- Register TrackioMonitor in MonitorMaster in monitor.py
- Trackio is a lightweight offline-first logging library with WandB-compatible API, logs can be visualized on HF Hub

Signed-off-by: Pranav Chanduri <preethivardhanc@gmail.com>
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 486c1022a4
class TrackioConfig(DeepSpeedConfigModel):
    """Sets parameters for Trackio monitor."""

    enabled: bool = False
    """ Whether logging to Trackio is enabled. Requires `trackio` package is installed. """

    project: str = "deepspeed"
Add required Trackio tests and docs
This introduces a new monitoring backend but the diff only changes production code; the workspace AGENTS.md requires new features to include corresponding tests and documentation updates. Please add coverage for the Trackio config/MonitorMaster wiring and logging behavior, and update the monitor docs so users can discover the new trackio configuration.
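One shape the requested config coverage could take is sketched below. It is a self-contained stand-in, not the PR's code: `TrackioConfigSketch` and `parse_trackio_section` are hypothetical names that mirror only the two fields and defaults visible in the quoted diff (`enabled = False`, `project = "deepspeed"`). A real test would presumably import `TrackioConfig` from `deepspeed/monitor/config.py` instead.

```python
from dataclasses import dataclass


@dataclass
class TrackioConfigSketch:
    """Hypothetical stand-in mirroring the fields of TrackioConfig in the diff."""
    enabled: bool = False
    project: str = "deepspeed"


def parse_trackio_section(ds_config: dict) -> TrackioConfigSketch:
    """Build the config from the optional "trackio" block of a config dict."""
    return TrackioConfigSketch(**ds_config.get("trackio", {}))


def test_defaults_when_section_missing():
    # With no "trackio" block, the monitor should stay disabled.
    cfg = parse_trackio_section({})
    assert cfg.enabled is False
    assert cfg.project == "deepspeed"


def test_user_values_override_defaults():
    cfg = parse_trackio_section(
        {"trackio": {"enabled": True, "project": "my-deepspeed-run"}}
    )
    assert cfg.enabled is True
    assert cfg.project == "my-deepspeed-run"
```

The two test functions follow pytest naming, so they would be collected as-is if dropped into a test module.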
Closes #7964
Summary
Adds Trackio as a new experiment monitoring backend to DeepSpeed,
following the existing pattern used by WandB, TensorBoard, Comet, and CSV monitors.
Trackio is a lightweight, offline-first logging library developed by
Hugging Face with a WandB-compatible API. Runs can be visualized as
an HF Space or dataset on the HF Hub.
Changes
- `deepspeed/monitor/trackio.py`: new `TrackioMonitor` class implementing the `Monitor` interface with `log()` and `write_events()` methods
- `deepspeed/monitor/config.py`: new `TrackioConfig` with `enabled` and `project` fields; registered in `get_monitor_config()` and `DeepSpeedMonitorConfig`; included in `check_enabled` validator
- `deepspeed/monitor/utils.py`: new `check_trackio_availability()` helper with install instructions
- `deepspeed/monitor/monitor.py`: `TrackioMonitor` imported and wired into `MonitorMaster.__init__()` and `write_events()`
Usage
Add to your DeepSpeed config:
{
  "trackio": {
    "enabled": true,
    "project": "my-deepspeed-run"
  }
}
Testing
- `TrackioMonitor` only initializes on rank 0, consistent with other monitors
- `trackio` not installed
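The rank-0 behavior described here can be illustrated with a small sketch. The PR's actual `TrackioMonitor` is not shown on this page, so everything below is an assumption made for illustration: `TrackioMonitorSketch` and `RecordingBackend` are hypothetical names, the backend is a stand-in object rather than the real `trackio` module, the rank is passed in explicitly instead of being read from the distributed runtime, and events are assumed to be `(label, value, step)` tuples.

```python
class RecordingBackend:
    """Hypothetical stand-in for a logging backend; records calls in memory."""

    def __init__(self):
        self.init_calls = []
        self.logged = []

    def init(self, project):
        self.init_calls.append(project)

    def log(self, data, step=None):
        self.logged.append((data, step))


class TrackioMonitorSketch:
    """Illustrative monitor: only rank 0 initializes the backend and logs."""

    def __init__(self, enabled, project, rank, backend):
        self.enabled = enabled
        self.rank = rank
        self.backend = backend
        if self._active():
            self.backend.init(project=project)

    def _active(self):
        # Logging happens only when the monitor is enabled and this is rank 0.
        return self.enabled and self.rank == 0

    def log(self, data, step=None):
        if self._active():
            self.backend.log(data, step=step)

    def write_events(self, event_list):
        # Assumption: each event is a (label, value, step) tuple.
        for label, value, step in event_list:
            self.log({label: value}, step=step)


# Rank 0 initializes and logs; any other rank is a no-op.
rank0_backend, rank1_backend = RecordingBackend(), RecordingBackend()
events = [("Train/loss", 0.5, 1), ("Train/lr", 0.001, 1)]
TrackioMonitorSketch(True, "my-deepspeed-run", 0, rank0_backend).write_events(events)
TrackioMonitorSketch(True, "my-deepspeed-run", 1, rank1_backend).write_events(events)
```

Injecting the backend keeps the sketch runnable without `trackio` installed, and the same approach could let a unit test assert on logged values without touching the network or disk.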