93 changes: 93 additions & 0 deletions .agents/skills/puzzletron/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,93 @@
# Puzzletron Agent Skill

Puzzletron is an end-to-end workflow for model pruning and MIP-based architecture optimization.
This skill exposes it as a slash command for AI coding agents.

For full environment setup, model configuration, and algorithm details see
[examples/puzzletron/README.md](../../examples/puzzletron/README.md).

> **Experimental:** AI agent integration is an experimental feature and may change.

Run `/puzzletron` with no arguments to see available commands.

## Running the MIP step

Start the MIP step by telling the agent how many GPUs per node to use:

```text
/puzzletron mip 4
```

Output is streamed live and also written to `./log.txt`. While it runs (or after it finishes),
check progress with:

```text
/puzzletron mip progress
```

Example output when complete:

```text
Overall: Puzzletron step 7/8 — MIP sweep (6 compression rates)
──────────────────────────────────────────────────────────────
  Status     Phase                                     Elapsed
──────────────────────────────────────────────────────────────
  [DONE]     Prep (teacher memory + rate list)             <1s
  [DONE]     compression_rate=0.5                       3m 52s
  [DONE]     compression_rate=0.6                       4m 41s
  [DONE]     compression_rate=0.7                       4m 46s
  [DONE]     compression_rate=0.8                       3m 55s
  [DONE]     compression_rate=0.9                       3m 55s
  [DONE]     compression_rate=1.0                       3m 59s
──────────────────────────────────────────────────────────────
  Started:   08:05:30
  Finished:  08:30:38
  Elapsed:   25m 8s
  Completed: 6/6 compression rates
  Remaining: done estimated

  Results: /workspace/puzzle_dir/mip_sweep_results.csv
```

While running, the report shows which rate is active, sub-step detail (MIP solver node count
or validation batch progress), and an estimated time remaining based on completed rates.
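The remaining-time estimate can be sketched as follows. This is a minimal illustration, assuming the estimate is the mean duration of the completed rates multiplied by the number of rates still to run; `estimate_remaining_seconds` is a hypothetical helper and is not taken from `mip_progress.py`.

```python
from typing import Optional, Sequence


def estimate_remaining_seconds(
    completed_durations: Sequence[float], total_rates: int
) -> Optional[float]:
    """Return an ETA in seconds, or None when no rate has finished yet."""
    if not completed_durations:
        return None  # nothing to extrapolate from
    avg_s = sum(completed_durations) / len(completed_durations)
    remaining_rates = max(total_rates - len(completed_durations), 0)
    return avg_s * remaining_rates


# Durations of the first three rates in the example report: 3m 52s, 4m 41s, 4m 46s.
eta_s = estimate_remaining_seconds([232, 281, 286], total_rates=6)
print(f"about {int(eta_s) // 60}m remaining")
```

Averaging over completed rates keeps the estimate stable when individual rates vary by a minute or so, at the cost of reacting slowly if later rates are systematically slower.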

## Running the full pipeline

To run all 8 pipeline steps (not just the MIP sweep):

```text
/puzzletron all 2
```

Check progress with:

```text
/puzzletron all progress
```

Example output while running:

```text
Overall: Puzzletron full pipeline (steps 1–8)
────────────────────────────────────────────────────────────────────
 Status     Step Description                         Elapsed
────────────────────────────────────────────────────────────────────
 [DONE]          1/8: starting puzzletron pipeline     0m 0s
 [DONE]          2/8: converting model to Puzzletron heterogeneous format (single-gpu)   0m 26s
 [DONE]          3/8: scoring pruning activations (multi-gpu)    9m 9s
 [DONE]          4/8: pruning the model and saving pruned checkpoints (single-gpu)   0m 57s
 [DONE]          5/8: building replacement library and subblock statistics (single-gpu)   0m 26s
 [RUNNING]       6/8: calculating one block scores (multi-gpu) (270/352 solutions)  100m 6s
 [ ]             7/8: pending
 [ ]             8/8: pending
────────────────────────────────────────────────────────────────────
Started: 00:08:50
Finished: 01:59:54 (in progress)
Elapsed: 111m 4s
Completed: 5/8 steps
Remaining: 56m 24s estimated
```

Step 6 progress is measured by counting the completed `solution_N.json` files on disk, which
gives a more accurate remaining-time estimate. Step 7 (MIP sweep) shows per-rate progress once it starts.
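A minimal sketch of the file-counting approach, in the spirit of `all_progress.py`: count the `solution*.json` files, divide the elapsed step time by that count, and extrapolate to the solutions still missing. The helper name `solutions_progress` and the demo directory are hypothetical.

```python
import glob
import os
import tempfile
from typing import Optional, Tuple


def solutions_progress(
    sol_dir: str, sol_total: int, step_elapsed_s: float
) -> Tuple[int, Optional[float]]:
    """Return (completed solution files, estimated seconds remaining or None)."""
    done = len(glob.glob(os.path.join(sol_dir, "solution*.json")))
    if done == 0 or sol_total <= 0:
        return done, None  # not enough information to extrapolate
    per_solution_s = step_elapsed_s / done
    return done, per_solution_s * max(sol_total - done, 0)


# Demo on a throwaway directory holding three made-up solution files.
with tempfile.TemporaryDirectory() as tmp:
    for i in range(3):
        with open(os.path.join(tmp, f"solution_{i}.json"), "w") as f:
            f.write("{}")
    done, remaining_s = solutions_progress(tmp, sol_total=4, step_elapsed_s=30.0)
    print(done, remaining_s)
```

Counting files on disk does not depend on how the log is buffered or interleaved across ranks, which is why it can be more reliable than parsing progress bars.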
94 changes: 94 additions & 0 deletions .agents/skills/puzzletron/SKILL.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,94 @@
---
name: puzzletron
description: "End-to-end workflow for model pruning and MIP-based optimization. Commands: mip, all. Usage: /puzzletron <command>"
license: Apache-2.0
---

# Puzzletron

## Routing

**STEP 1 — Check args before doing anything else. This is MANDATORY.**

- If args are **empty**, output the block below verbatim and **STOP immediately. Do NOT proceed to any command.**
- If the first word of args does **not exactly match** `mip` or `all`, output the block below verbatim and **STOP immediately. Do NOT proceed to any command.**

---

**Puzzletron** — end-to-end workflow for model pruning and MIP-based optimization.

Available commands:
- `mip <nproc_per_node>` — Run the MIP step (nproc_per_node: number of GPUs per node)
- `mip progress` — Show live MIP progress with timing summary
- `all <nproc_per_node>` — Run the full Puzzletron pipeline (nproc_per_node: number of GPUs per node)
- `all progress` — Show live full pipeline progress with timing summary

Usage: `/puzzletron <command> [args]`

---

**STEP 2 — Only if the first word of args exactly matches a command name, execute it. Never reach this step if args were empty.**

## Command: all

Parse `nproc_per_node` from args using either positional or flag syntax:
- Positional: second word is a number, e.g. `all 2`
- Flag: `--nproc_per_node <value>` anywhere in args, e.g. `all --nproc_per_node 2`

- If the second word is exactly `progress`, execute the **all progress** sub-command below.
- If no `nproc_per_node` value can be found, ask the user: "Please provide the number of GPUs per node (nproc_per_node)." and **STOP**.
- If the value does not match `^[1-9][0-9]*$`, ask the user: "nproc_per_node must be a positive integer." and **STOP**.
- Otherwise use the parsed value and run the full pipeline.
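The routing and parsing rules above can be sketched in Python. This is only an illustration of the intended behavior, not code the skill executes; `route` and `COMMANDS` are hypothetical names, and the sketch returns the raw value without applying the integer check.

```python
from typing import List, Optional, Tuple

COMMANDS = ("mip", "all")


def route(args: str) -> Tuple[str, Optional[str]]:
    """Return (action, raw nproc_per_node value) for a /puzzletron argument string."""
    words: List[str] = args.split()
    if not words or words[0] not in COMMANDS:
        return "help", None  # empty or unknown command: show the help block
    command, rest = words[0], words[1:]
    if rest and rest[0] == "progress":
        return f"{command} progress", None
    if "--nproc_per_node" in rest:
        # Flag syntax: the value is the word that follows the flag.
        idx = rest.index("--nproc_per_node")
        value = rest[idx + 1] if idx + 1 < len(rest) else None
    else:
        # Positional syntax: the value is the second word, if present.
        value = rest[0] if rest else None
    return command, value


for example in ("", "status", "all progress", "all 2", "mip --nproc_per_node 4", "mip"):
    print(repr(example), "->", route(example))
```

A `None` value corresponds to the "Please provide the number of GPUs per node" prompt.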

### all \<nproc_per_node\>

Run the following Bash command, substituting `<nproc_per_node>` with the parsed value:

```bash
set -o pipefail && export PYTHONPATH=$PYTHONPATH:/workspace/Model-Optimizer && \
  torchrun --nproc_per_node <nproc_per_node> examples/puzzletron/main.py \
    --config examples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/llama-3_1-8B_pruneffn_memory.yaml \
    2>&1 | tee ./log.txt | grep --line-buffered "Puzzletron Progress"
```

Stream output to the user as it arrives. When the command finishes, report the exit code.
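The `grep` filter keeps only the step markers; the full output remains in `./log.txt`. As a rough sketch of how those markers can be read back, the snippet below applies the same `Puzzletron Progress N/M: description` pattern that `all_progress.py` uses. The sample log lines are made up for illustration, and `progress_events` is a hypothetical helper.

```python
import re
from typing import Iterable, List, Tuple

PROGRESS_RE = re.compile(r"Puzzletron Progress (\d+)/(\d+): (.+)")


def progress_events(lines: Iterable[str]) -> List[Tuple[int, int, str]]:
    """Return (step, total_steps, description) for every progress marker found."""
    events = []
    for line in lines:
        match = PROGRESS_RE.search(line)
        if match:
            events.append((int(match.group(1)), int(match.group(2)), match.group(3).strip()))
    return events


# Invented sample lines; real log lines may carry extra prefixes.
sample_log = [
    "[2026-01-01 00:08:50] Puzzletron Progress 1/8: starting puzzletron pipeline\n",
    "[2026-01-01 00:08:51] some unrelated log line\n",
    "[2026-01-01 00:09:16] Puzzletron Progress 2/8: converting model (single-gpu)\n",
]
for step, total, description in progress_events(sample_log):
    print(f"{step}/{total}: {description}")
```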

### all progress

Run the following Bash command. Present the output to the user wrapped in a fenced code block (``` ... ```).

```bash
python3 .agents/skills/puzzletron/all_progress.py
```

## Command: mip

Parse `nproc_per_node` from args using either positional or flag syntax:
- Positional: second word is a number, e.g. `mip 2`
- Flag: `--nproc_per_node <value>` anywhere in args, e.g. `mip --nproc_per_node 2`

- If the second word is exactly `progress`, execute the **mip progress** sub-command below.
- If no `nproc_per_node` value can be found, ask the user: "Please provide the number of GPUs per node (nproc_per_node)." and **STOP**.
- If the value does not match `^[1-9][0-9]*$`, ask the user: "nproc_per_node must be a positive integer." and **STOP**.
- Otherwise use the parsed value and run the MIP step.

### mip \<nproc_per_node\>

Run the following Bash command, substituting `<nproc_per_node>` with the parsed value:

```bash
set -o pipefail && export PYTHONPATH=$PYTHONPATH:/workspace/Model-Optimizer && \
  torchrun --nproc_per_node <nproc_per_node> examples/puzzletron/main.py \
    --config examples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/llama-3_1-8B_pruneffn_memory.yaml \
    --mip-only 2>&1 | tee ./log.txt | grep --line-buffered "Puzzletron Progress"
```

Stream output to the user as it arrives. When the command finishes, report the exit code.

### mip progress

Run the following Bash command. Present the output to the user wrapped in a fenced code block (``` ... ```).

```bash
python3 .agents/skills/puzzletron/mip_progress.py
```
174 changes: 174 additions & 0 deletions .agents/skills/puzzletron/all_progress.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,174 @@
# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Generated with Claude Code
"""Progress report for the full Puzzletron pipeline (all 8 steps)."""

import glob
import re
import sys
from datetime import datetime

LOG = "./log.txt"
try:
    with open(LOG) as f:
        lines = f.readlines()
    text = "".join(lines)
except FileNotFoundError:
    print("No log.txt found. Run /puzzletron all first.")
    sys.exit(0)


def fmt(s):
    """Format seconds as 'Xm Ys', or '—' if None."""
    return f"{int(s) // 60}m {int(s) % 60}s" if s is not None else "—"


def get_ts(line):
    """Extract a datetime from a log line timestamp, or None."""
    m = re.search(r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})", line)
    return datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S") if m else None


now = datetime.now().replace(microsecond=0)
DIV = "─" * 68

step_events = []
for line in lines:
    m = re.search(r"Puzzletron Progress (\d+)/(\d+): (.+)", line)
    if m:
        step_num = int(m.group(1))
        total_steps = int(m.group(2))
        desc = m.group(3).strip()
        ts = get_ts(line)
        step_events.append((step_num, total_steps, desc, ts))

total_steps = step_events[-1][1] if step_events else 8
seen_steps = {e[0]: (e[2], e[3]) for e in step_events}
last_step_num = max(seen_steps.keys()) if seen_steps else 0

pipeline_complete_ts = None
if last_step_num == total_steps and total_steps in seen_steps:
    pipeline_complete_ts = seen_steps[total_steps][1]

cur_detail = ""
step_remaining = None
batch_matches = re.findall(r"calculate_losses_pipeline[^:]*:\s*(\d+)%.*?(\d+)/(\d+)", text)
cbc_matches = re.findall(r"After (\d+) nodes.*?\(([\d.]+) seconds\)", text)

sol_dir_match = re.search(
    r"'output_dir': '([^']+single_sequence_replacement_solutions--validation[^']*)'", text
)
sol_done, sol_total = None, None
if sol_dir_match:
    sol_dir = sol_dir_match.group(1)
    sol_done = len(glob.glob(f"{sol_dir}/solution*.json"))
    sol_list_match = re.search(r"'solutions_to_validate': \[([\d, ]+)\]", text)
    if sol_list_match:
        sol_total = len(sol_list_match.group(1).split(","))
pct, cur_b, total_b = batch_matches[-1] if batch_matches else (None, None, None)
if sol_done is not None and sol_total:
    cur_detail = f" ({sol_done}/{sol_total} solutions)"
elif batch_matches:
    cur_detail = f" ({cur_b}/{total_b} batches)"
elif cbc_matches:
    nodes, secs = cbc_matches[-1]
    cur_detail = f" (MIP solver: {int(nodes):,} nodes, {float(secs):.1f}s)"

pipeline_start = step_events[0][3] if step_events else None
end_ts = pipeline_complete_ts or now
total_elapsed = int((end_ts - pipeline_start).total_seconds()) if pipeline_start else 0

step_ts_list = sorted(seen_steps.items())
cur_step_start_ts = seen_steps[last_step_num][1] if last_step_num in seen_steps else None
if not pipeline_complete_ts and cur_step_start_ts:
    cur_step_elapsed = int((now - cur_step_start_ts).total_seconds())
    if sol_done and sol_total and sol_done > 0:
        rate_per_sol = cur_step_elapsed / sol_done
        step_remaining = rate_per_sol * (sol_total - sol_done)
    elif cur_b is not None and total_b is not None and int(cur_b) > 0 and int(cur_b) < int(total_b):
        rate_per_batch = cur_step_elapsed / int(cur_b)
        step_remaining = rate_per_batch * (int(total_b) - int(cur_b))

print(f"\nOverall: Puzzletron full pipeline (steps 1–{total_steps})") # noqa: RUF001
print(DIV)
print(f" {'Status':<10} {'Step':<4} {'Description':<34} {'Elapsed':>8}")
print(DIV)

for i, (snum, (sdesc, sts)) in enumerate(step_ts_list):
    next_ts = (
        step_ts_list[i + 1][1][1] if i + 1 < len(step_ts_list) else (pipeline_complete_ts or now)
    )
    elapsed = int((next_ts - sts).total_seconds()) if sts and next_ts else None
    is_last = snum == last_step_num
    is_done = not is_last or pipeline_complete_ts is not None
    detail = ""
    if is_last and not is_done:
        detail = cur_detail
    label = f"{snum}/{total_steps}: {sdesc}{detail}"
    status = "[DONE]" if is_done else "[RUNNING]"
    print(
        f" {status:<10} {'':<4} {label:<34} {fmt(elapsed) if elapsed is not None else '—':>8}"
    )

for snum in range(last_step_num + 1, total_steps + 1):
    print(f" {'[ ]':<10} {'':<4} {f'{snum}/{total_steps}: pending':<34} {'':>8}")

print(DIV)
done_steps = len([s for s in seen_steps if s != last_step_num or pipeline_complete_ts])
step_durations = []
for i, (snum, (sdesc, sts)) in enumerate(step_ts_list):
    next_ts = (
        step_ts_list[i + 1][1][1] if i + 1 < len(step_ts_list) else (pipeline_complete_ts or None)
    )
    if next_ts and sts:
        step_durations.append(int((next_ts - sts).total_seconds()))
avg_step_s = sum(step_durations) / len(step_durations) if step_durations else None


def step_est(snum):
    """Estimate duration in seconds for a pending pipeline step."""
    if snum == 7:
        # Step 7 in the full pipeline is a single MIP solve (~5m), not a sweep
        return 296
    elif snum == 8:
        return 60
    return avg_step_s or 0


if pipeline_complete_ts:
    est_rem = "done"
elif step_remaining is not None:
    future_s = sum(step_est(s) for s in range(last_step_num + 1, total_steps + 1))
    est_rem = fmt(step_remaining + future_s)
else:
    cur_s = step_est(last_step_num)
    future_s = cur_s + sum(step_est(s) for s in range(last_step_num + 1, total_steps + 1))
    est_rem = fmt(future_s) if (cur_s or future_s) else "calculating..."

finished_str = (
    pipeline_complete_ts.strftime("%H:%M:%S")
    if pipeline_complete_ts
    else now.strftime("%H:%M:%S") + " (in progress)"
)
print(f" Started: {pipeline_start.strftime('%H:%M:%S') if pipeline_start else '—'}")
print(f" Finished: {finished_str}")
print(f" Elapsed: {fmt(total_elapsed)}")
print(f" Completed: {done_steps}/{total_steps} steps")
print(f" Remaining: {est_rem} estimated")
results_match = re.search(r"Results written to: (\S+)", text)
if not results_match:
    results_match = re.search(r"\[run_puzzle\.py:335\]\s+(\S+)", text)
if results_match:
    print(f"\n Results: {results_match.group(1)}")