Experimental claude skill for puzzletron algorithm #1769
Open
danielkorzekwa wants to merge 17 commits into main from dkorzekwa/puzzletron_claude_skill
Commits (all authored by danielkorzekwa):
8ab8f7a Add a dummy puzzletron skill
9a6d716 Add progress command for puzzletron mip_sweep
1e79463 update puzzletron readme
f99c01f fix typo in SKILL.md
131296c Add PYTHONPATH to mip_sweep skill
c5c2015 add skill for puzzletron all
1a62799 update progress bar
fa197c2 Rename mip_sweep to mip command
03ceee3 Move python scripts for puzzletron claude command to py scripts.
02ec52e update progress bar
dd6fbdb code clean up
c9f1fe1 code clean up
afb6a71 add changelog for puzzletron claude skill
37d7132 NameError: unpack batch_matches before if/elif so cur_b and total_b a…
417ae87 add nproc_per_node integer validation to prevent shell injection in al…
708cf7b replace hardcoded sweep.py line-number markers with content-based det…
ca9d8b4 add set -o pipefail to torchrun pipelines so torchrun failures are no…
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,93 @@ | ||
| # Puzzletron Agent Skill | ||
|
|
||
| Puzzletron is an end-to-end workflow for model pruning and MIP-based architecture optimization. | ||
| This skill exposes it as a slash command for AI coding agents. | ||
|
|
||
| For full environment setup, model configuration, and algorithm details see | ||
| [examples/puzzletron/README.md](../../examples/puzzletron/README.md). | ||
|
|
||
| > **Experimental:** AI agent integration is an experimental feature and may change. | ||
|
|
||
| Run `/puzzletron` with no arguments to see available commands. | ||
|
|
||
| ## Running the MIP step | ||
|
|
||
| Start the MIP step by telling the agent how many GPUs per node to use: | ||
|
|
||
| ```text | ||
| /puzzletron mip 4 | ||
| ``` | ||
|
|
||
| Progress lines are streamed live, and the full output is written to `./log.txt`. While it runs (or after it finishes), | ||
| check progress with: | ||
|
|
||
| ```text | ||
| /puzzletron mip progress | ||
| ``` | ||
|
|
||
| Example output when complete: | ||
|
|
||
| ```text | ||
| Overall: Puzzletron step 7/8 — MIP sweep (6 compression rates) | ||
| ────────────────────────────────────────────────────────────── | ||
| Status Phase Elapsed | ||
| ────────────────────────────────────────────────────────────── | ||
| [DONE] Prep (teacher memory + rate list) <1s | ||
| [DONE] compression_rate=0.5 3m 52s | ||
| [DONE] compression_rate=0.6 4m 41s | ||
| [DONE] compression_rate=0.7 4m 46s | ||
| [DONE] compression_rate=0.8 3m 55s | ||
| [DONE] compression_rate=0.9 3m 55s | ||
| [DONE] compression_rate=1.0 3m 59s | ||
| ────────────────────────────────────────────────────────────── | ||
| Started: 08:05:30 | ||
| Finished: 08:30:38 | ||
| Elapsed: 25m 8s | ||
| Completed: 6/6 compression rates | ||
| Remaining: done estimated | ||
|
|
||
| Results: /workspace/puzzle_dir/mip_sweep_results.csv | ||
| ``` | ||
|
|
||
| While running, the report shows which rate is active, sub-step detail (MIP solver node count | ||
| or validation batch progress), and an estimated time remaining based on completed rates. | ||
|
|
||
| ## Running the full pipeline | ||
|
|
||
| To run all 8 pipeline steps (not just the MIP sweep): | ||
|
|
||
| ```text | ||
| /puzzletron all 2 | ||
| ``` | ||
|
|
||
| Check progress with: | ||
|
|
||
| ```text | ||
| /puzzletron all progress | ||
| ``` | ||
|
|
||
| Example output while running: | ||
|
|
||
| ```text | ||
| Overall: Puzzletron full pipeline (steps 1–8) | ||
| ──────────────────────────────────────────────────────────────────── | ||
| Status Step Description Elapsed | ||
| ──────────────────────────────────────────────────────────────────── | ||
| [DONE] 1/8: starting puzzletron pipeline 0m 0s | ||
| [DONE] 2/8: converting model to Puzzletron heterogeneous format (single-gpu) 0m 26s | ||
| [DONE] 3/8: scoring pruning activations (multi-gpu) 9m 9s | ||
| [DONE] 4/8: pruning the model and saving pruned checkpoints (single-gpu) 0m 57s | ||
| [DONE] 5/8: building replacement library and subblock statistics (single-gpu) 0m 26s | ||
| [RUNNING] 6/8: calculating one block scores (multi-gpu) (270/352 solutions) 100m 6s | ||
| [ ] 7/8: pending | ||
| [ ] 8/8: pending | ||
| ──────────────────────────────────────────────────────────────────── | ||
| Started: 00:08:50 | ||
| Finished: 01:59:54 (in progress) | ||
| Elapsed: 111m 4s | ||
| Completed: 5/8 steps | ||
| Remaining: 56m 24s estimated | ||
| ``` | ||
|
|
||
| Step 6 progress is tracked via completed `solution_N.json` files on disk for an accurate | ||
| remaining estimate. Step 7 (MIP sweep) shows per-rate progress once it starts. |
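
The solution-count estimate described for step 6 can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the helper names `count_solutions` and `estimate_remaining_s` are hypothetical, and the only details taken from the diff are the `solution*.json` file pattern and the idea of scaling elapsed time by the number of solutions still outstanding.

```python
import glob
import os
from typing import Optional


def count_solutions(sol_dir: str) -> int:
    """Count solution files written so far (hypothetical helper)."""
    return len(glob.glob(os.path.join(sol_dir, "solution*.json")))


def estimate_remaining_s(elapsed_s: float, done: int, total: int) -> Optional[float]:
    """Linear estimate: average seconds per finished solution times solutions left.

    Returns None when nothing has finished yet, because no rate can be computed.
    """
    if done <= 0 or total <= 0:
        return None
    seconds_per_solution = elapsed_s / done
    return seconds_per_solution * max(total - done, 0)
```

With 50 of 100 solutions finished after 100 seconds, the sketch returns 100.0 seconds remaining. It returns `None` until at least one solution exists, so a caller can fall back to another estimate.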
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,94 @@ | ||
| --- | ||
| name: puzzletron | ||
| description: "End-to-end workflow for model pruning and MIP-based optimization. Commands: mip, all. Usage: /puzzletron <command>" | ||
| license: Apache-2.0 | ||
| --- | ||
|
|
||
| # Puzzletron | ||
|
|
||
| ## Routing | ||
|
|
||
| **STEP 1 — Check args before doing anything else. This is MANDATORY.** | ||
|
|
||
| - If args are **empty**, output the block below verbatim and **STOP immediately. Do NOT proceed to any command.** | ||
| - If the first word of args does **not exactly match** `mip` or `all`, output the block below verbatim and **STOP immediately. Do NOT proceed to any command.** | ||
|
|
||
| --- | ||
|
|
||
| **Puzzletron** — end-to-end workflow for model pruning and MIP-based optimization. | ||
|
|
||
| Available commands: | ||
| - `mip <nproc_per_node>` — Run the MIP step (nproc_per_node: number of GPUs per node) | ||
| - `mip progress` — Show live MIP progress with timing summary | ||
| - `all <nproc_per_node>` — Run the full Puzzletron pipeline (nproc_per_node: number of GPUs per node) | ||
| - `all progress` — Show live full pipeline progress with timing summary | ||
|
|
||
| Usage: `/puzzletron <command> [args]` | ||
|
|
||
| --- | ||
|
|
||
| **STEP 2 — Only if the first word of args exactly matches a command name, execute it. Never reach this step if args were empty.** | ||
|
|
||
| ## Command: all | ||
|
|
||
| Parse `nproc_per_node` from args using either positional or flag syntax: | ||
| - Positional: second word is a number, e.g. `all 2` | ||
| - Flag: `--nproc_per_node <value>` anywhere in args, e.g. `all --nproc_per_node 2` | ||
|
|
||
| - If the second word is exactly `progress`, execute the **all progress** sub-command below. | ||
| - If no `nproc_per_node` value can be found, ask the user: "Please provide the number of GPUs per node (nproc_per_node)." and **STOP**. | ||
| - If the value does not match `^[1-9][0-9]*$`, ask the user: "nproc_per_node must be a positive integer." and **STOP**. | ||
| - Otherwise use the parsed value and run the full pipeline. | ||
|
|
||
| ### all \<nproc_per_node\> | ||
|
|
||
| Run the following Bash command, substituting `<nproc_per_node>` with the parsed value: | ||
|
|
||
| ```bash | ||
| set -o pipefail && export PYTHONPATH=$PYTHONPATH:/workspace/Model-Optimizer && \ | ||
| torchrun --nproc_per_node <nproc_per_node> examples/puzzletron/main.py \ | ||
| --config examples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/llama-3_1-8B_pruneffn_memory.yaml \ | ||
| 2>&1 | tee ./log.txt | grep "Puzzletron Progress" | ||
| ``` | ||
|
|
||
| Stream output to the user as it arrives. When the command finishes, report the exit code. | ||
|
|
||
| ### all progress | ||
|
|
||
| Run the following Bash command. Present the output to the user wrapped in a fenced code block (``` ... ```). | ||
|
|
||
| ```bash | ||
| python3 .agents/skills/puzzletron/all_progress.py | ||
| ``` | ||
|
|
||
| ## Command: mip | ||
|
|
||
| Parse `nproc_per_node` from args using either positional or flag syntax: | ||
| - Positional: second word is a number, e.g. `mip 2` | ||
| - Flag: `--nproc_per_node <value>` anywhere in args, e.g. `mip --nproc_per_node 2` | ||
|
|
||
| - If the second word is exactly `progress`, execute the **mip progress** sub-command below. | ||
| - If no `nproc_per_node` value can be found, ask the user: "Please provide the number of GPUs per node (nproc_per_node)." and **STOP**. | ||
| - If the value does not match `^[1-9][0-9]*$`, ask the user: "nproc_per_node must be a positive integer." and **STOP**. | ||
| - Otherwise use the parsed value and run the MIP step. | ||
|
|
||
| ### mip \<nproc_per_node\> | ||
|
|
||
| Run the following Bash command, substituting `<nproc_per_node>` with the parsed value: | ||
|
|
||
| ```bash | ||
| set -o pipefail && export PYTHONPATH=$PYTHONPATH:/workspace/Model-Optimizer && \ | ||
| torchrun --nproc_per_node <nproc_per_node> examples/puzzletron/main.py \ | ||
| --config examples/puzzletron/configs/llama-3_1-8B_pruneffn_memory/llama-3_1-8B_pruneffn_memory.yaml \ | ||
| --mip-only 2>&1 | tee ./log.txt | grep "Puzzletron Progress" | ||
| ``` | ||
|
|
||
| Stream output to the user as it arrives. When the command finishes, report the exit code. | ||
|
|
||
| ### mip progress | ||
|
|
||
| Run the following Bash command. Present the output to the user wrapped in a fenced code block (``` ... ```). | ||
|
|
||
| ```bash | ||
| python3 .agents/skills/puzzletron/mip_progress.py | ||
| ``` | ||
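
The argument handling that the skill asks the agent to perform for `mip` and `all` can be sketched in Python as below. This is an illustration under assumptions rather than part of the PR: the function name `parse_nproc_per_node` is hypothetical, and rejecting zero is an assumption drawn from the "positive integer" wording of the error message. The sketch uses `re.fullmatch` so the whole token must consist of digits, which is what keeps shell metacharacters out of the `torchrun` command line.

```python
import re
from typing import Optional


def parse_nproc_per_node(args: str) -> Optional[str]:
    """Return a validated nproc_per_node token, or None if missing or invalid.

    Supports both syntaxes described in the skill:
      positional: "<command> 2"
      flag:       "<command> --nproc_per_node 2"
    The caller is assumed to have handled the "progress" sub-command already.
    """
    words = args.split()
    value = None
    if "--nproc_per_node" in words:
        idx = words.index("--nproc_per_node")
        if idx + 1 < len(words):
            value = words[idx + 1]
    elif len(words) >= 2:
        value = words[1]
    # fullmatch requires the entire token to be a non-zero decimal integer.
    if value is None or not re.fullmatch(r"[1-9][0-9]*", value):
        return None
    return value
```

For example, `parse_nproc_per_node("mip 4")` returns `"4"`, while `parse_nproc_per_node("mip 4; rm -rf /")` returns `None` because the second token is `4;` and fails the digit-only check.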
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,174 @@ | ||
| # SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
|
|
||
| # Generated with Claude Code | ||
| """Progress report for the full Puzzletron pipeline (all 8 steps).""" | ||
|
|
||
| import glob | ||
| import re | ||
| import sys | ||
| from datetime import datetime | ||
|
|
||
| LOG = "./log.txt" | ||
| try: | ||
|     # Use a context manager so the log file handle is closed promptly. | ||
|     with open(LOG) as f: | ||
|         lines = f.readlines() | ||
|     text = "".join(lines) | ||
| except FileNotFoundError: | ||
|     print("No log.txt found. Run /puzzletron all first.") | ||
|     sys.exit(0) | ||
|
|
||
|
|
||
| def fmt(s): | ||
|     """Format seconds as 'Xm Ys', or '—' if None.""" | ||
|     return f"{int(s) // 60}m {int(s) % 60}s" if s is not None else "—" | ||
|
|
||
|
|
||
| def get_ts(line): | ||
|     """Extract a datetime from a log line timestamp, or None.""" | ||
|     m = re.search(r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})", line) | ||
|     return datetime.strptime(m.group(1), "%Y-%m-%d %H:%M:%S") if m else None | ||
|
|
||
|
|
||
| now = datetime.now().replace(microsecond=0) | ||
| DIV = "─" * 68 | ||
|
|
||
| step_events = [] | ||
| for line in lines: | ||
|     m = re.search(r"Puzzletron Progress (\d+)/(\d+): (.+)", line) | ||
|     if m: | ||
|         step_num = int(m.group(1)) | ||
|         total_steps = int(m.group(2)) | ||
|         desc = m.group(3).strip() | ||
|         ts = get_ts(line) | ||
|         step_events.append((step_num, total_steps, desc, ts)) | ||
|
|
||
| total_steps = step_events[-1][1] if step_events else 8 | ||
| seen_steps = {e[0]: (e[2], e[3]) for e in step_events} | ||
| last_step_num = max(seen_steps.keys()) if seen_steps else 0 | ||
|
|
||
| pipeline_complete_ts = None | ||
| if last_step_num == total_steps and total_steps in seen_steps: | ||
|     pipeline_complete_ts = seen_steps[total_steps][1] | ||
|
|
||
| cur_detail = "" | ||
| step_remaining = None | ||
| batch_matches = re.findall(r"calculate_losses_pipeline[^:]*:\s*(\d+)%.*?(\d+)/(\d+)", text) | ||
| cbc_matches = re.findall(r"After (\d+) nodes.*?\(([\d.]+) seconds\)", text) | ||
|
|
||
| sol_dir_match = re.search( | ||
|     r"'output_dir': '([^']+single_sequence_replacement_solutions--validation[^']*)'", text | ||
| ) | ||
| sol_done, sol_total = None, None | ||
| if sol_dir_match: | ||
|     sol_dir = sol_dir_match.group(1) | ||
|     sol_done = len(glob.glob(f"{sol_dir}/solution*.json")) | ||
| sol_list_match = re.search(r"'solutions_to_validate': \[([\d, ]+)\]", text) | ||
| if sol_list_match: | ||
|     sol_total = len(sol_list_match.group(1).split(",")) | ||
| pct, cur_b, total_b = batch_matches[-1] if batch_matches else (None, None, None) | ||
| if sol_done is not None and sol_total: | ||
|     cur_detail = f" ({sol_done}/{sol_total} solutions)" | ||
| elif batch_matches: | ||
|     cur_detail = f" ({cur_b}/{total_b} batches)" | ||
| elif cbc_matches: | ||
|     nodes, secs = cbc_matches[-1] | ||
|     cur_detail = f" (MIP solver: {int(nodes):,} nodes, {float(secs):.1f}s)" | ||
|
|
||
| pipeline_start = step_events[0][3] if step_events else None | ||
| end_ts = pipeline_complete_ts or now | ||
| total_elapsed = int((end_ts - pipeline_start).total_seconds()) if pipeline_start else 0 | ||
|
|
||
| step_ts_list = sorted(seen_steps.items()) | ||
| cur_step_start_ts = seen_steps[last_step_num][1] if last_step_num in seen_steps else None | ||
| if not pipeline_complete_ts and cur_step_start_ts: | ||
|     cur_step_elapsed = int((now - cur_step_start_ts).total_seconds()) | ||
|     if sol_done and sol_total and sol_done > 0: | ||
|         rate_per_sol = cur_step_elapsed / sol_done | ||
|         step_remaining = rate_per_sol * (sol_total - sol_done) | ||
|     elif cur_b is not None and total_b is not None and int(cur_b) > 0 and int(cur_b) < int(total_b): | ||
|         rate_per_batch = cur_step_elapsed / int(cur_b) | ||
|         step_remaining = rate_per_batch * (int(total_b) - int(cur_b)) | ||
|
|
||
| print(f"\nOverall: Puzzletron full pipeline (steps 1–{total_steps})") # noqa: RUF001 | ||
| print(DIV) | ||
| print(f" {'Status':<10} {'Step':<4} {'Description':<34} {'Elapsed':>8}") | ||
| print(DIV) | ||
|
|
||
| for i, (snum, (sdesc, sts)) in enumerate(step_ts_list): | ||
|     next_ts = ( | ||
|         step_ts_list[i + 1][1][1] if i + 1 < len(step_ts_list) else (pipeline_complete_ts or now) | ||
|     ) | ||
|     elapsed = int((next_ts - sts).total_seconds()) if sts and next_ts else None | ||
|     is_last = snum == last_step_num | ||
|     is_done = not is_last or pipeline_complete_ts is not None | ||
|     detail = "" | ||
|     if is_last and not is_done: | ||
|         detail = cur_detail | ||
|     label = f"{snum}/{total_steps}: {sdesc}{detail}" | ||
|     status = "[DONE]" if is_done else "[RUNNING]" | ||
|     print( | ||
|         f" {status:<10} {'':<4} {label:<34} {fmt(elapsed) if elapsed is not None else '—':>8}" | ||
|     ) | ||
|
|
||
| for snum in range(last_step_num + 1, total_steps + 1): | ||
|     print(f" {'[ ]':<10} {'':<4} {f'{snum}/{total_steps}: pending':<34} {'':>8}") | ||
|
|
||
| print(DIV) | ||
| done_steps = len([s for s in seen_steps if s != last_step_num or pipeline_complete_ts]) | ||
| step_durations = [] | ||
| for i, (snum, (sdesc, sts)) in enumerate(step_ts_list): | ||
|     next_ts = ( | ||
|         step_ts_list[i + 1][1][1] if i + 1 < len(step_ts_list) else (pipeline_complete_ts or None) | ||
|     ) | ||
|     if next_ts and sts: | ||
|         step_durations.append(int((next_ts - sts).total_seconds())) | ||
| avg_step_s = sum(step_durations) / len(step_durations) if step_durations else None | ||
|
|
||
|
|
||
| def step_est(snum): | ||
|     """Estimate duration in seconds for a pending pipeline step.""" | ||
|     if snum == 7: | ||
|         # Step 7 in the full pipeline is a single MIP solve (~5m), not a sweep | ||
|         return 296 | ||
|     elif snum == 8: | ||
|         return 60 | ||
|     return avg_step_s or 0 | ||
|
|
||
|
|
||
| if pipeline_complete_ts: | ||
|     est_rem = "done" | ||
| elif step_remaining is not None: | ||
|     future_s = sum(step_est(s) for s in range(last_step_num + 1, total_steps + 1)) | ||
|     est_rem = fmt(step_remaining + future_s) | ||
| else: | ||
|     cur_s = step_est(last_step_num) | ||
|     future_s = cur_s + sum(step_est(s) for s in range(last_step_num + 1, total_steps + 1)) | ||
|     est_rem = fmt(future_s) if (cur_s or future_s) else "calculating..." | ||
|
|
||
| finished_str = ( | ||
|     pipeline_complete_ts.strftime("%H:%M:%S") | ||
|     if pipeline_complete_ts | ||
|     else now.strftime("%H:%M:%S") + " (in progress)" | ||
| ) | ||
| print(f" Started: {pipeline_start.strftime('%H:%M:%S') if pipeline_start else '—'}") | ||
| print(f" Finished: {finished_str}") | ||
| print(f" Elapsed: {fmt(total_elapsed)}") | ||
| print(f" Completed: {done_steps}/{total_steps} steps") | ||
| print(f" Remaining: {est_rem} estimated") | ||
| results_match = re.search(r"Results written to: (\S+)", text) | ||
| if not results_match: | ||
|     results_match = re.search(r"\[run_puzzle\.py:335\]\s+(\S+)", text) | ||
| if results_match: | ||
|     print(f"\n Results: {results_match.group(1)}") | ||
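
The per-step timing logic in this script comes down to pairing each `Puzzletron Progress N/M` marker with the timestamp of the next one. A condensed sketch of that idea is below. It reuses the two regular expressions from the diff, but the function names `parse_steps` and `step_durations` and the sample log lines are hypothetical, and the real log format may carry extra fields.

```python
import re
from datetime import datetime

TS_RE = re.compile(r"\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")
STEP_RE = re.compile(r"Puzzletron Progress (\d+)/(\d+): (.+)")


def parse_steps(lines):
    """Map step number -> start timestamp for every progress marker found."""
    steps = {}
    for line in lines:
        step = STEP_RE.search(line)
        ts = TS_RE.search(line)
        if step and ts:
            steps[int(step.group(1))] = datetime.strptime(ts.group(1), "%Y-%m-%d %H:%M:%S")
    return steps


def step_durations(steps):
    """Duration in seconds of each step that already has a successor."""
    ordered = sorted(steps.items())
    return {
        num: int((next_ts - ts).total_seconds())
        for (num, ts), (_, next_ts) in zip(ordered, ordered[1:])
    }


# Hypothetical log lines, shaped only to satisfy the two patterns above.
SAMPLE = [
    "[2026-01-01 00:08:50] Puzzletron Progress 1/8: starting puzzletron pipeline",
    "[2026-01-01 00:08:50] Puzzletron Progress 2/8: converting model",
    "unrelated output line",
    "[2026-01-01 00:09:16] Puzzletron Progress 3/8: scoring pruning activations",
]
print(step_durations(parse_steps(SAMPLE)))  # {1: 0, 2: 26}
```

The last step seen has no successor, so it gets no entry here; the script in the diff covers that case by measuring against the current time instead.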