NVIDIA · danielkorzekwa · Jun 17, 2026
@@ -0,0 +1,93 @@
+# Puzzletron Agent Skill
+
+Puzzletron is an end-to-end workflow for model pruning and MIP-based architecture optimization.
+This skill exposes it as a slash command for AI coding agents.
+
+For full environment setup, model configuration, and algorithm details, see
+[examples/puzzletron/README.md](../../examples/puzzletron/README.md).
+
+> **Experimental:** AI agent integration is an experimental feature and may change.
+
+Run `/puzzletron` with no arguments to see available commands.
+
+## Running the MIP step
+
+Start the MIP step by telling the agent how many GPUs per node to use:
+
+```text
+/puzzletron mip 4
+```
+
+Progress lines are streamed live, and the full output is written to `./log.txt`. While it runs
+(or after it finishes), check progress with:
+
+```text
+/puzzletron mip progress
+```
+
+Example output when complete:
+
+```text
+Overall: Puzzletron step 7/8 — MIP sweep (6 compression rates)
+──────────────────────────────────────────────────────────────
+  Status      Phase                              Elapsed
+──────────────────────────────────────────────────────────────
+  [DONE]      Prep (teacher memory + rate list)       <1s
+  [DONE]      compression_rate=0.5                3m 52s
+  [DONE]      compression_rate=0.6                4m 41s
+  [DONE]      compression_rate=0.7                4m 46s
+  [DONE]      compression_rate=0.8                3m 55s
+  [DONE]      compression_rate=0.9                3m 55s
+  [DONE]      compression_rate=1.0                3m 59s
+──────────────────────────────────────────────────────────────
+  Started:   08:05:30
+  Finished:  08:30:38
+  Elapsed:   25m 8s
+  Completed: 6/6 compression rates
+  Remaining: done estimated
+
+  Results:   /workspace/puzzle_dir/mip_sweep_results.csv
+```
+
+While running, the report shows which rate is active, sub-step detail (MIP solver node count
+or validation batch progress), and an estimated time remaining based on completed rates.
+
+## Running the full pipeline
+
+To run all 8 pipeline steps (not just the MIP sweep):
+
+```text
+/puzzletron all 2
+```
+
+Check progress with:
+
+```text
+/puzzletron all progress
+```
+
+Example output while running:
+
+```text
+Overall: Puzzletron full pipeline (steps 1–8)
+────────────────────────────────────────────────────────────────────
+  Status      Step  Description                          Elapsed
+────────────────────────────────────────────────────────────────────
+  [DONE]            1/8: starting puzzletron pipeline      0m 0s
+  [DONE]            2/8: converting model to Puzzletron heterogeneous format (single-gpu)    0m 26s
+  [DONE]            3/8: scoring pruning activations (multi-gpu)     9m 9s
+  [DONE]            4/8: pruning the model and saving pruned checkpoints (single-gpu)    0m 57s
+  [DONE]            5/8: building replacement library and subblock statistics (single-gpu)    0m 26s
+  [RUNNING]         6/8: calculating one block scores (multi-gpu) (270/352 solutions)   100m 6s
+  [ ]               7/8: pending
+  [ ]               8/8: pending
+────────────────────────────────────────────────────────────────────
+  Started:   00:08:50
+  Finished:  01:59:54 (in progress)
+  Elapsed:   111m 4s
+  Completed: 5/8 steps
+  Remaining: 56m 24s estimated
+```
+
+Step 6 progress is tracked by counting the completed `solution_N.json` files on disk, which gives
+a more accurate estimate of the remaining time. Step 7 (MIP sweep) shows per-rate progress once it starts.
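
The remaining-time estimate for a step that reports completed work items can be sketched as a linear extrapolation. This is a minimal illustration, assuming every item takes roughly the same time; `estimate_remaining` and `fmt` are hypothetical names, not functions from the skill.

```python
from typing import Optional


def estimate_remaining(elapsed_s: float, done: int, total: int) -> Optional[float]:
    """Linearly extrapolate the remaining seconds from completed work items."""
    if done <= 0 or total <= 0:
        return None  # nothing finished yet, so there is no rate to extrapolate
    per_item = elapsed_s / done
    return per_item * max(total - done, 0)


def fmt(seconds: Optional[float]) -> str:
    """Format seconds as 'Xm Ys', or 'unknown' when no estimate exists."""
    if seconds is None:
        return "unknown"
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes}m {secs}s"


# Step 6 figures from the sample report: 270 of 352 solutions after 100m 6s.
print(fmt(estimate_remaining(100 * 60 + 6, 270, 352)))
```

This covers the running step only. The progress script additionally adds its own estimates for the steps that have not started, so its reported total is larger.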
@@ -0,0 +1,94 @@
+---
+name: puzzletron
+description: "End-to-end workflow for model pruning and MIP-based optimization. Commands: mip, all. Usage: /puzzletron <command>"
+license: Apache-2.0
+---
+
+# Puzzletron
+
+## Routing
+
+**STEP 1 — Check args before doing anything else. This is MANDATORY.**
+
+- If args are **empty**, output the block below verbatim and **STOP immediately. Do NOT proceed to any command.**
+- If the first word of args does **not exactly match** `mip` or `all`, output the block below verbatim and **STOP immediately. Do NOT proceed to any command.**
+
+---
+
+**Puzzletron** — end-to-end workflow for model pruning and MIP-based optimization.
+
+Available commands:
+- `mip <nproc_per_node>` — Run the MIP step (nproc_per_node: number of GPUs per node)
+- `mip progress` — Show live MIP progress with timing summary
+- `all <nproc_per_node>` — Run the full Puzzletron pipeline (nproc_per_node: number of GPUs per node)
+- `all progress` — Show live full pipeline progress with timing summary
+
+Usage: `/puzzletron <command> [args]`
+
+---
+
+**STEP 2 — Only if the first word of args exactly matches a command name, execute it. Never reach this step if args were empty.**
+
+## Command: all
+
+Parse `nproc_per_node` from args using either positional or flag syntax:
+- Positional: second word is a number, e.g. `all 2`
+- Flag: `--nproc_per_node <value>` anywhere in args, e.g. `all --nproc_per_node 2`
+
+- If the second word is exactly `progress`, execute the **all progress** sub-command below.
+- If no `nproc_per_node` value can be found, ask the user: "Please provide the number of GPUs per node (nproc_per_node)." and **STOP**.
+- If the value does not match `^[1-9][0-9]*$`, ask the user: "nproc_per_node must be a positive integer." and **STOP**.
+- Otherwise use the parsed value and run the full pipeline.
+
+### all \<nproc_per_node\>
+
+Run the following Bash command, substituting `<nproc_per_node>` with the parsed value:
+
+```bash
+set -o pipefail && export PYTHONPATH=$PYTHONPATH:/workspace/Model-Optimizer && \
+torchrun --nproc_per_node <nproc_per_node> examples/puzzletron/main.py \
+  --config examples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/llama-3_1-8B_pruneffn_memory.yaml \
+  2>&1 | tee ./log.txt | grep "Puzzletron Progress"
+```
+
+Stream output to the user as it arrives. When the command finishes, report the exit code.
+
+### all progress
+
+Run the following Bash command. Present the output to the user wrapped in a fenced code block (``` ... ```).
+
+```bash
+python3 .agents/skills/puzzletron/all_progress.py
+```
+
+## Command: mip
+
+Parse `nproc_per_node` from args using either positional or flag syntax:
+- Positional: second word is a number, e.g. `mip 2`
+- Flag: `--nproc_per_node <value>` anywhere in args, e.g. `mip --nproc_per_node 2`
+
+- If the second word is exactly `progress`, execute the **mip progress** sub-command below.
+- If no `nproc_per_node` value can be found, ask the user: "Please provide the number of GPUs per node (nproc_per_node)." and **STOP**.
+- If the value does not match `^[1-9][0-9]*$`, ask the user: "nproc_per_node must be a positive integer." and **STOP**.
+- Otherwise use the parsed value and run the MIP step.
+
+### mip \<nproc_per_node\>
+
+Run the following Bash command, substituting `<nproc_per_node>` with the parsed value:
+
+```bash
+set -o pipefail && export PYTHONPATH=$PYTHONPATH:/workspace/Model-Optimizer && \
+torchrun --nproc_per_node <nproc_per_node> examples/puzzletron/main.py \
+  --config examples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/llama-3_1-8B_pruneffn_memory.yaml \
+  --mip-only 2>&1 | tee ./log.txt | grep "Puzzletron Progress"
+```
+
+Stream output to the user as it arrives. When the command finishes, report the exit code.
+
+### mip progress
+
+Run the following Bash command. Present the output to the user wrapped in a fenced code block (``` ... ```).
+
+```bash
+python3 .agents/skills/puzzletron/mip_progress.py
+```
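
The routing rules in this skill file are written for an agent to follow, but they can also be read as a small decision procedure. The sketch below is one possible interpretation, not code from the PR: `route` and its return values are hypothetical, and it assumes that a non-numeric second word is treated as an invalid value rather than as a missing one.

```python
import re

COMMANDS = {"mip", "all"}


def route(args: str):
    """Return an (action, payload) pair for a /puzzletron argument string."""
    words = args.split()
    # Step 1: empty args or an unknown first word only prints the help block.
    if not words or words[0] not in COMMANDS:
        return ("help", None)
    cmd = words[0]
    if len(words) > 1 and words[1] == "progress":
        return (f"{cmd} progress", None)
    # Flag syntax may appear anywhere; otherwise fall back to the second word.
    value = None
    if "--nproc_per_node" in words:
        i = words.index("--nproc_per_node")
        value = words[i + 1] if i + 1 < len(words) else None
    elif len(words) > 1:
        value = words[1]
    if value is None:
        return ("ask", "Please provide the number of GPUs per node (nproc_per_node).")
    # Digits only, and greater than zero to honor "positive integer".
    if not re.fullmatch(r"[0-9]+", value) or int(value) == 0:
        return ("ask", "nproc_per_node must be a positive integer.")
    return (f"{cmd} run", int(value))


print(route("mip --nproc_per_node 4"))
```

Validating the value before it is substituted into the `torchrun` command also keeps arbitrary text out of the shell invocation.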
@@ -0,0 +1,174 @@
+# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# SPDX-License-Identifier: Apache-2.0
+#
+# Licensed under the Apache License, Version 2.0 (the "License");
+# you may not use this file except in compliance with the License.
+# You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Generated with Claude Code
+"""Progress report for the full Puzzletron pipeline (all 8 steps)."""
+
+import glob
+import re
+import sys
+from datetime import datetime
+
+LOG = "./log.txt"
+try:
+    # Read the log once; the context manager closes the file handle promptly.
+    with open(LOG) as f:
+        lines = f.readlines()
+    text = "".join(lines)
+except FileNotFoundError:
+    print("No log.txt found. Run /puzzletron all first.")
+    sys.exit(0)
+
+
+def fmt(s):
+    """Format seconds as 'Xm Ys', or '—' if None."""
+    return f"{int(s) // 60}m {int(s) % 60}s" if s is not None else "—"
+
+
+def get_ts(line):
+    """Extract a datetime from a log line timestamp, or None."""
+    m = re.search(r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})", line)
+    return datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S") if m else None
+
+
+now = datetime.now().replace(microsecond=0)
+DIV = "─" * 68
+
+step_events = []
+for line in lines:
+    m = re.search(r"Puzzletron Progress (\d+)/(\d+): (.+)", line)
+    if m:
+        step_num = int(m.group(1))
+        total_steps = int(m.group(2))
+        desc = m.group(3).strip()
+        ts = get_ts(line)
+        step_events.append((step_num, total_steps, desc, ts))
+
+total_steps = step_events[-1][1] if step_events else 8
+seen_steps = {e[0]: (e[2], e[3]) for e in step_events}
+last_step_num = max(seen_steps.keys()) if seen_steps else 0
+
+pipeline_complete_ts = None
+if last_step_num == total_steps and total_steps in seen_steps:
+    pipeline_complete_ts = seen_steps[total_steps][1]
+
+cur_detail = ""
+step_remaining = None
+batch_matches = re.findall(r"calculate_losses_pipeline[^:]*:\s*(\d+)%.*?(\d+)/(\d+)", text)
+cbc_matches = re.findall(r"After (\d+) nodes.*?\(([\d.]+) seconds\)", text)
+
+sol_dir_match = re.search(
+    r"'output_dir': '([^']+single_sequence_replacement_solutions--validation[^']*)'", text
+)
+sol_done, sol_total = None, None
+if sol_dir_match:
+    sol_dir = sol_dir_match.group(1)
+    sol_done = len(glob.glob(f"{sol_dir}/solution*.json"))
+    sol_list_match = re.search(r"'solutions_to_validate': \[([\d, ]+)\]", text)
+    if sol_list_match:
+        sol_total = len(sol_list_match.group(1).split(","))
+pct, cur_b, total_b = batch_matches[-1] if batch_matches else (None, None, None)
+if sol_done is not None and sol_total:
+    cur_detail = f" ({sol_done}/{sol_total} solutions)"
+elif batch_matches:
+    cur_detail = f" ({cur_b}/{total_b} batches)"
+elif cbc_matches:
+    nodes, secs = cbc_matches[-1]
+    cur_detail = f" (MIP solver: {int(nodes):,} nodes, {float(secs):.1f}s)"
+
+pipeline_start = step_events[0][3] if step_events else None
+end_ts = pipeline_complete_ts or now
+total_elapsed = int((end_ts - pipeline_start).total_seconds()) if pipeline_start else 0
+
+step_ts_list = sorted(seen_steps.items())
+cur_step_start_ts = seen_steps[last_step_num][1] if last_step_num in seen_steps else None
+if not pipeline_complete_ts and cur_step_start_ts:
+    cur_step_elapsed = int((now - cur_step_start_ts).total_seconds())
+    # Extrapolate linearly: average time per finished item times the items left.
+    if sol_done and sol_total:
+        rate_per_sol = cur_step_elapsed / sol_done
+        # Clamp at zero in case more solution files exist than were listed.
+        step_remaining = rate_per_sol * max(sol_total - sol_done, 0)
+    elif cur_b is not None and total_b is not None and 0 < int(cur_b) < int(total_b):
+        rate_per_batch = cur_step_elapsed / int(cur_b)
+        step_remaining = rate_per_batch * (int(total_b) - int(cur_b))
+
+print(f"\nOverall: Puzzletron full pipeline (steps 1–{total_steps})")  # noqa: RUF001
+print(DIV)
+print(f"  {'Status':<10}  {'Step':<4}  {'Description':<34}  {'Elapsed':>8}")
+print(DIV)
+
+for i, (snum, (sdesc, sts)) in enumerate(step_ts_list):
+    next_ts = (
+        step_ts_list[i + 1][1][1] if i + 1 < len(step_ts_list) else (pipeline_complete_ts or now)
+    )
+    elapsed = int((next_ts - sts).total_seconds()) if sts and next_ts else None
+    is_last = snum == last_step_num
+    is_done = not is_last or pipeline_complete_ts is not None
+    detail = ""
+    if is_last and not is_done:
+        detail = cur_detail
+    label = f"{snum}/{total_steps}: {sdesc}{detail}"
+    status = "[DONE]" if is_done else "[RUNNING]"
+    print(f"  {status:<10}  {'':<4}  {label:<34}  {fmt(elapsed):>8}")
+
+for snum in range(last_step_num + 1, total_steps + 1):
+    print(f"  {'[ ]':<10}  {'':<4}  {f'{snum}/{total_steps}: pending':<34}  {'':>8}")
+
+print(DIV)
+done_steps = len([s for s in seen_steps if s != last_step_num or pipeline_complete_ts])
+step_durations = []
+for i, (snum, (sdesc, sts)) in enumerate(step_ts_list):
+    # Only closed intervals count: the running step has no end timestamp yet.
+    next_ts = step_ts_list[i + 1][1][1] if i + 1 < len(step_ts_list) else pipeline_complete_ts
+    if next_ts and sts:
+        step_durations.append(int((next_ts - sts).total_seconds()))
+avg_step_s = sum(step_durations) / len(step_durations) if step_durations else None
+
+
+def step_est(snum):
+    """Estimate duration in seconds for a pending pipeline step."""
+    if snum == 7:
+        # Step 7 in the full pipeline is a single MIP solve (~5m), not a sweep
+        return 296
+    elif snum == 8:
+        return 60
+    return avg_step_s or 0
+
+
+if pipeline_complete_ts:
+    est_rem = "done"
+elif step_remaining is not None:
+    future_s = sum(step_est(s) for s in range(last_step_num + 1, total_steps + 1))
+    est_rem = fmt(step_remaining + future_s)
+else:
+    cur_s = step_est(last_step_num)
+    future_s = cur_s + sum(step_est(s) for s in range(last_step_num + 1, total_steps + 1))
+    est_rem = fmt(future_s) if (cur_s or future_s) else "calculating..."
+
+finished_str = (
+    pipeline_complete_ts.strftime("%H:%M:%S")
+    if pipeline_complete_ts
+    else now.strftime("%H:%M:%S") + " (in progress)"
+)
+print(f"  Started:   {pipeline_start.strftime('%H:%M:%S') if pipeline_start else '—'}")
+print(f"  Finished:  {finished_str}")
+print(f"  Elapsed:   {fmt(total_elapsed)}")
+print(f"  Completed: {done_steps}/{total_steps} steps")
+print(f"  Remaining: {est_rem} estimated")
+results_match = re.search(r"Results written to: (\S+)", text)
+if not results_match:
+    results_match = re.search(r"\[run_puzzle\.py:335\]\s+(\S+)", text)
+if results_match:
+    print(f"\n  Results:   {results_match.group(1)}")
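
The per-step timing in the script above comes from pairing each `Puzzletron Progress N/M` marker with the timestamp of the next one. The sketch below isolates that idea using the same two regular expressions. The function name `step_durations` and the sample log lines are hypothetical, and the real log layout may differ.

```python
import re
from datetime import datetime

TS_RE = re.compile(r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")
STEP_RE = re.compile(r"Puzzletron Progress (\d+)/(\d+): (.+)")


def step_durations(lines):
    """Map step number to the seconds until the next step marker appeared."""
    marks = []
    for line in lines:
        step, ts = STEP_RE.search(line), TS_RE.search(line)
        if step and ts:
            when = datetime.strptime(ts.group(1), "%Y-%m-%d %H:%M:%S")
            marks.append((int(step.group(1)), when))
    # The last marker has no successor, so the running step gets no duration.
    return {
        num: int((nxt - start).total_seconds())
        for (num, start), (_, nxt) in zip(marks, marks[1:])
    }


# Made-up log lines for illustration only.
sample = [
    "[2026-06-17 00:08:50] Puzzletron Progress 1/8: starting puzzletron pipeline",
    "[2026-06-17 00:08:50] Puzzletron Progress 2/8: converting model",
    "[2026-06-17 00:09:16] Puzzletron Progress 3/8: scoring pruning activations",
]
print(step_durations(sample))  # {1: 0, 2: 26}
```

Wrapping the logic in a function, rather than running it at module level as the script does, makes it possible to test it against canned log lines.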