Skip to content

feat: E_PT2 energy correction for HI-NQS (Issue #35 Tier 1 #2)#37

Draft
thc1006 wants to merge 10 commits intoQuantumNoLab:mainfrom
thc1006:feat/e-pt2-correction
Draft

feat: E_PT2 energy correction for HI-NQS (Issue #35 Tier 1 #2)#37
thc1006 wants to merge 10 commits intoQuantumNoLab:mainfrom
thc1006:feat/e-pt2-correction

Conversation

@thc1006
Copy link
Copy Markdown
Member

@thc1006 thc1006 commented Apr 4, 2026

feat: E_PT2 EN-PT2 能量修正(Issue #35 Tier 1 #2

變更內容

  • compute_e_pt2:新增 EN-PT2 能量修正函式(_pt2_helpers.py
    • E_PT2 = Σ_{x∉V} |⟨x|H|Ψ₀⟩|² / (E₀ - H_xx)
    • 使用 full variational basis + 配對的 eigenvector/energy
  • HINQSSQDConfig.compute_pt2_correction:新增 opt-in 欄位(default False
  • 整合到 run_hi_nqs_sqd:啟用時在最終 iteration 後做 full-basis diag + E_PT2 計算
    • metadata 包含 e_pt2pt2_e0corrected_energypt2_wall_time
  • use_ibm_solver 改為 tri-stateNone=auto、True=強制啟用、False=強制停用
  • CIPSI _SPARSE_DIAG_THRESHOLD:10K → 8K(default params 下可觸發 sparse path)
  • 修正 test_initial_basis.py:patch _IBM_SQD_AVAILABLE 避免 mock hamiltonian crash

Benchmark 結果(CPU,Numba + IBM solver)

系統 E_var (Ha) E_PT2 (mHa) E_corrected (Ha) Basis PT2 time
C₂H₂ 24Q -75.894 -135 -76.029 6,309 25.6s
N₂ 40Q -109.049 -190 -109.240 4,786 153s

E_PT2 修正量較大(135-190 mHa),表示 NQS basis 仍不完整。結合 Tier 1 #1(H-connection expansion)可大幅縮小。

GPU 對比(DGX Spark GB10)

系統 CPU total GPU total 備註
C₂H₂ 86s 64s GPU 快 1.4x
N₂ 40Q 499s 1756s GPU 慢 3.5x(IBM solver CPU-only 瓶頸)

結論:當前架構下 CPU 更快。GPU 加速需替換 IBM solver(Issue #35 Tier 3)。

下一步加速路線(調研結果)

Phase 方案 加速 時間
0 qiskit-addon-dice-solver(CPU,30-100 orb) 演算法提升 1-2 天
1a AMD-HPC/amd-sbd subprocess(GB200 已驗證) 95x 1-2 週
1b CuPy eigsh + custom sigma vector 10-30x 2-4 週
2 sbd nanobind bindings 95x(零拷貝) 2-3 月

TDD 測試

  • 10 個新測試(5 compute_e_pt2 + 5 integration)
  • 461 全套通過,0 失敗

Checklist

  • TDD RED → GREEN → REFACTOR
  • compute_e_pt2 standalone 函式 + 5 tests
  • HINQSSQDConfig.compute_pt2_correction 欄位
  • Full-basis diag for E_PT2(不是 batch sub-sample)
  • use_ibm_solver tri-state 修正
  • CIPSI sparse threshold 修正
  • Code review 修正(5 個問題)
  • CPU + GPU benchmark
  • 461 tests passed

Co-authored-by: leo07010 leo07010@gmail.com

thc1006 and others added 6 commits April 3, 2026 16:14
… Tier 1 #2)

E_PT2 = Σ_{x∉V} ⟨x|H|Ψ₀⟩² / (E₀ - H_xx)

Standalone pure function in _pt2_helpers.py with 5 TDD tests:
- returns float, negative from HF reference, corrected energy closer
  to FCI, full basis gives zero
- Add compute_pt2_correction field to HINQSSQDConfig (default False)
- When enabled, metadata includes e_pt2 and corrected_energy
- Size guard prevents crash when coeffs/basis mismatch
- 43 tests passing (8 new for E_PT2)

Co-authored-by: leo07010 <leo07010@gmail.com>
Tests use mock hamiltonian without integrals; IBM solver auto-detect
tries to access hamiltonian.integrals.h1e causing AttributeError.
- compute_e_pt2 now receives cumulative_basis + full-basis eigenvector
  (not prev_batch_configs which is a random sub-sample)
- E₀ for PT2 denominator comes from same diag as Ψ₀ (consistency)
- Removed unused external_hxx dict in compute_e_pt2
- Fix docstring: |⟨x|H|Ψ₀⟩|² (abs-squared), clarify real-valued note
- Emphasize basis must be FULL variational space in docstring
- Remove unused basis_coeff_map from compute_e_pt2
- Add pt2_e0 and pt2_wall_time to metadata for traceability
- corrected_energy uses pt2_e0 (from full-basis diag), not best_energy
- test_metadata_contains_pt2_e0_and_wall_time: verify traceability
- test_corrected_energy_uses_pt2_e0: verify corrected = pt2_e0 + e_pt2
  (not best_energy + e_pt2)
- CHANGELOG: E_PT2 correction, compute_pt2_correction config, tri-state use_ibm_solver
- api_reference: compute_e_pt2 function signature and description
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds optional Epstein–Nesbet PT2 (EN-PT2) energy correction to the HI+NQS+SQD workflow to produce a post-variational corrected energy, along with unit/integration tests to validate the correction behavior and metadata wiring.

Changes:

  • Introduced compute_e_pt2() to compute EN-PT2 energy corrections over external determinants.
  • Added compute_pt2_correction (opt-in) to HINQSSQDConfig and integrated PT2 correction + metadata into run_hi_nqs_sqd.
  • Expanded tests for compute_e_pt2 and end-to-end PT2 integration; patched IBM SQD availability in initial-basis tests.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 4 comments.

File Description
src/qvartools/methods/nqs/_pt2_helpers.py Adds compute_e_pt2() implementation for EN-PT2 correction.
src/qvartools/methods/nqs/hi_nqs_sqd.py Adds config flag + optional full-basis diag and PT2 correction metadata wiring.
tests/test_methods/test_pt2_selection.py Adds unit tests for compute_e_pt2 and integration tests for PT2 correction in run_hi_nqs_sqd.
tests/test_methods/test_initial_basis.py Patches _IBM_SQD_AVAILABLE to stabilize tests when mocking the solver.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/qvartools/methods/nqs/hi_nqs_sqd.py Outdated
Comment thread src/qvartools/methods/nqs/hi_nqs_sqd.py Outdated
Comment thread src/qvartools/methods/nqs/_pt2_helpers.py Outdated
Comment thread src/qvartools/methods/nqs/_pt2_helpers.py Outdated
1. Docstring: corrected_energy = pt2_e0 + e_pt2 (not energy + e_pt2)
2. Dense diag threshold: 50K → 8K (align with CIPSI _SPARSE_DIAG_THRESHOLD)
3. Memory leak: connections[j].detach().clone() to avoid retaining full tensor
4. Intruder state: clamp near-zero denominator instead of skipping
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/qvartools/methods/nqs/_pt2_helpers.py
Comment thread src/qvartools/methods/nqs/_pt2_helpers.py Outdated
1. Raise ValueError if coeffs.shape[0] != basis.shape[0] (fail-fast)
2. Use diagonal_elements_batch() instead of per-config diagonal_element()
   (faster + frees config tensors after batched computation)
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 3 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread src/qvartools/methods/nqs/_pt2_helpers.py Outdated
Comment thread src/qvartools/methods/nqs/_pt2_helpers.py
Comment thread src/qvartools/methods/nqs/_pt2_helpers.py Outdated
1. Docstring: add diagonal_elements_batch to required Hamiltonian API
2. Store external configs on CPU (.detach().cpu().clone()) to reduce GPU OOM risk
3. Chunked diagonal computation (batch_size=4096) to bound peak memory
@George930502
Copy link
Copy Markdown
Collaborator

Code review

No issues found. Checked for bugs and CLAUDE.md compliance.

🤖 Generated with Claude Code

@thc1006 thc1006 marked this pull request as draft April 5, 2026 19:29
@thc1006
Copy link
Copy Markdown
Member Author

thc1006 commented Apr 5, 2026

張人魚說這先不要合併

@George930502
Copy link
Copy Markdown
Collaborator

OK

thc1006 added a commit to thc1006/qvartools that referenced this pull request Apr 7, 2026
Adds the four NQS methods from src/qvartools/methods/nqs/ as first-class
runnable pipeline catalog entries, alongside the existing 001-009 ablation
groups. Follows the convention "one method = one folder, variants = multiple
.py inside the folder, one multi-section YAML per method".

New pipelines (7 scripts in 4 folders):

- 010_hi_nqs_sqd/  (HI+NQS+SQD, iterative, optional PT2/IBM)
  * default.py    — baseline, auto IBM detect
  * pt2.py        — use_pt2_selection=True + temperature annealing
  * ibm_off.py    — use_ibm_solver=False (force GPU fallback)

- 011_hi_nqs_skqd/ (HI+NQS+SKQD, iterative + Krylov expansion)
  * default.py    — baseline (use_ibm_solver=False)
  * ibm_on.py     — use_ibm_solver=True + S-CORE recovery

- 012_nqs_sqd/    (two-stage NQS+SQD, no iteration)
  * default.py

- 013_nqs_skqd/   (two-stage NQS+SKQD with Krylov expansion)
  * default.py

Each YAML has one section per variant. Each wrapper script is ~80 LoC, uses
the existing experiments/config_loader.py for CLI+YAML, calls the
qvartools.methods.nqs.run_*() function, and prints results in the format
that run_all_pipelines.py's regex parsers expect (Best energy, Final energy,
Wall time).

run_all_pipelines.py PIPELINES list extended from 26 to 33 entries.

Bonus fix discovered during smoke testing: nqs_sqd.py and nqs_skqd.py used
mol_info["n_orbitals"] directly, but get_molecule() does NOT populate that
key — it must come from hamiltonian.integrals. The HI methods avoided this
via extract_orbital_counts(); the simple methods did not. Fixed by routing
both simple methods through extract_orbital_counts() too. This is a
pre-existing bug that nothing was exercising before; now fixed and verified.

Smoke tests on H2 (CPU):
- 010 hi_nqs_sqd/default:   E=-1.1372838345 Ha (0.0 mHa error)  10.5s
- 011 hi_nqs_skqd/default:  E=-1.1372838345 Ha (0.0 mHa error)  10.6s
- 012 nqs_sqd/default:      E=-1.1167593074 Ha (20.5 mHa error) 74.6s  (HF, expected — pure NQS ablation)
- 013 nqs_skqd/default:     E=-1.1372838345 Ha (0.0 mHa error)  44.4s

All 52 tests in tests/test_methods/ + tests/test_run_all_pipelines.py pass.
ruff check + ruff format clean.

Note: 010_hi_nqs_sqd/correction.py (compute_pt2_correction=True variant) is
intentionally deferred — that config field only exists on the
feat/e-pt2-correction branch (PR QuantumNoLab#37). Will add as a follow-up commit after
PR QuantumNoLab#37 merges to main and this branch rebases.
thc1006 added a commit that referenced this pull request Apr 15, 2026
…peline entries (#39)

* docs(adr-003): cross-reference Issue #38 and correct amd-sbd speedup

- Added "Update 2026-04-07" section noting that AMD-HPC/amd-sbd's 95x is
  measured on MI250X (AMD GPU), not GB200 (which is only 2.64x in the same
  paper). This significantly weakens the case for the C++/sbd path on NVIDIA
  hardware.
- Added cross-references to Issue #38 (CuPy factored-space Davidson, parallel
  path), and to memory/project_phase2b_davidson_commitment.md.
- Documented that Phase 2b proper Davidson is non-negotiable for multireference
  chemistry (Cr2, dissociation, open-shell) regardless of which backend wins.

* refactor(pipelines): rename folders 01-09 to 001-009 (3-digit prefix)

Renames the 9 existing pipeline group folders and their YAML configs from
2-digit to 3-digit prefix to leave room for the upcoming 010+ method-as-pipeline
catalog (Issue: standardized open-source benchmark catalog).

Changes:
- git mv all 9 pipeline folders: 01_dci → 001_dci, ..., 09_vqe → 009_vqe
- git mv all 9 YAML configs to match
- Update first-line comment in each YAML to use new prefix
- Update PIPELINES list in run_all_pipelines.py (26 entries)
- Update --skip-iterative check from ("06","07","08") to ("006","007","008")
- Update --skip-quantum check from "09_vqe" to "009_vqe"
- Update --only help text and docstring example to 3-digit
- Update tests/test_run_all_pipelines.py (4 references to 09_vqe)

This is a pure rename + reference update. No behaviour change. All 3 tests in
test_run_all_pipelines.py still pass.

Doc updates (AGENTS.md, README.md, RST files) are scheduled for Phase 4 of
this refactor in a separate commit.

* fix: patch _IBM_SQD_AVAILABLE in initial_basis tests

Tests use mock hamiltonian without integrals; IBM solver auto-detect
tries to access hamiltonian.integrals.h1e causing AttributeError.

* refactor(methods/nqs): extract shared helpers and add METHODS_REGISTRY

Light additive refactor of the four NQS methods to remove duplication and
expose a registry that the upcoming 010+ pipeline wrappers (and any future
benchmark catalog tooling) can use to dispatch by method id.

Changes:
- Add src/qvartools/methods/nqs/_shared.py with three helpers extracted
  from the four method runners (build_autoregressive_nqs,
  extract_orbital_counts, validate_initial_basis). Bit-equivalent to
  the inlined code that was there before.
- Update nqs_sqd.py, nqs_skqd.py, hi_nqs_sqd.py, hi_nqs_skqd.py to use
  the new helpers. No behaviour change, no API change, no signature
  change.
- Add METHODS_REGISTRY dict to methods/nqs/__init__.py mapping
  {nqs_sqd, nqs_skqd, hi_nqs_sqd, hi_nqs_skqd} -> run_fn, config_cls,
  metadata flags (iterative/has_krylov_expansion/has_ibm_solver/
  has_pt2_selection/supports_initial_basis), description, and
  pipeline_folder.

Verified:
- All 49 tests in tests/test_methods/ pass (was 46 + 3 cherry-picked
  failures from main; cherry-pick of 3f39fe2 fixed the 3).
- 62 tests across test_run_all_pipelines + test_methods + test_nqs/
  test_adapters all pass.
- Zero changes to algorithmic logic, exception types, return shapes,
  or function signatures.

* feat(pipelines): add 010-013 method-as-pipeline catalog (NQS methods)

Adds the four NQS methods from src/qvartools/methods/nqs/ as first-class
runnable pipeline catalog entries, alongside the existing 001-009 ablation
groups. Follows the convention "one method = one folder, variants = multiple
.py inside the folder, one multi-section YAML per method".

New pipelines (7 scripts in 4 folders):

- 010_hi_nqs_sqd/  (HI+NQS+SQD, iterative, optional PT2/IBM)
  * default.py    — baseline, auto IBM detect
  * pt2.py        — use_pt2_selection=True + temperature annealing
  * ibm_off.py    — use_ibm_solver=False (force GPU fallback)

- 011_hi_nqs_skqd/ (HI+NQS+SKQD, iterative + Krylov expansion)
  * default.py    — baseline (use_ibm_solver=False)
  * ibm_on.py     — use_ibm_solver=True + S-CORE recovery

- 012_nqs_sqd/    (two-stage NQS+SQD, no iteration)
  * default.py

- 013_nqs_skqd/   (two-stage NQS+SKQD with Krylov expansion)
  * default.py

Each YAML has one section per variant. Each wrapper script is ~80 LoC, uses
the existing experiments/config_loader.py for CLI+YAML, calls the
qvartools.methods.nqs.run_*() function, and prints results in the format
that run_all_pipelines.py's regex parsers expect (Best energy, Final energy,
Wall time).

run_all_pipelines.py PIPELINES list extended from 26 to 33 entries.

Bonus fix discovered during smoke testing: nqs_sqd.py and nqs_skqd.py used
mol_info["n_orbitals"] directly, but get_molecule() does NOT populate that
key — it must come from hamiltonian.integrals. The HI methods avoided this
via extract_orbital_counts(); the simple methods did not. Fixed by routing
both simple methods through extract_orbital_counts() too. This is a
pre-existing bug that nothing was exercising before; now fixed and verified.

Smoke tests on H2 (CPU):
- 010 hi_nqs_sqd/default:   E=-1.1372838345 Ha (0.0 mHa error)  10.5s
- 011 hi_nqs_skqd/default:  E=-1.1372838345 Ha (0.0 mHa error)  10.6s
- 012 nqs_sqd/default:      E=-1.1167593074 Ha (20.5 mHa error) 74.6s  (HF, expected — pure NQS ablation)
- 013 nqs_skqd/default:     E=-1.1372838345 Ha (0.0 mHa error)  44.4s

All 52 tests in tests/test_methods/ + tests/test_run_all_pipelines.py pass.
ruff check + ruff format clean.

Note: 010_hi_nqs_sqd/correction.py (compute_pt2_correction=True variant) is
intentionally deferred — that config field only exists on the
feat/e-pt2-correction branch (PR #37). Will add as a follow-up commit after
PR #37 merges to main and this branch rebases.

* docs(pipelines): update for 3-digit catalog and add 010-013 documentation

Phase 4 of the pipeline catalog refactor: update every doc file that
references the old 0X_ pipeline paths to use the new 3-digit names, and
add a new section documenting the 010-013 method-as-pipeline catalog.

Files updated:
- AGENTS.md
  * directory tree: list all 13 pipeline folders (001-009 + 010-013)
  * config_loader pattern examples use 002_nf_dci paths and 010 wrappers
  * "24 Pipeline Variants" → "33 Pipeline Variants"
- README.md
  * Quick start examples show both 001_dci and 010_hi_nqs_sqd usage
  * --only example uses 3-digit prefixes
- docs/experiments_guide.md
  * All Group 01-09 headers renumbered to 001-009
  * All 24 ablation script paths updated to new prefixes
  * --only example: 01 02 04 → 001 002 004
  * "all 24 pipelines" → "all 33 pipelines"
  * NEW Tier 2 section documenting 010-013 method-as-pipeline catalog
    with per-group variant tables and the numbering convention
- docs/source/user_guide/pipelines.rst
  * Restructured to show "Tier 1: ablation groups (001-009)" and
    "Tier 2: method-as-pipeline catalog (010-013)"
  * Added Tier 2 table with method ids, variants, and METHODS_REGISTRY note
  * Added "Numbering convention" subsection (001-009 / 010-099 / 100-199 / 200+)
  * Updated iterative pipeline code example (mol_info workaround removed —
    extract_orbital_counts handles it now)
  * CLI examples updated to 3-digit
- docs/source/user_guide/yaml_configs.rst
  * Tier 1 / Tier 2 split for the YAML file table
  * Added 009_vqe.yaml + the 4 new 010-013 multi-section YAMLs
  * CLI examples updated to 3-digit
- docs/source/tutorials/h2_pipeline.rst
- docs/source/tutorials/beh2_pipeline.rst
  * CLI examples updated to 3-digit
  * h2 tutorial: added a 010-013 example
- docs/decisions/002-eliminate-torch-numpy-roundtrips.md
  * Updated the validation script paths and "24/24" → "33/33"

Verified: grep for 0[1-9]_(dci|nf_dci|nf_only|hf_only|iterative_nqs|vqe)
in the live tree (excluding .claude/worktrees) returns zero matches.

* fix: address 10 Copilot review findings on PR #39

Bug 1 — experiments/pipelines/run_all_pipelines.py:376
    --skip-iterative was only skipping groups 006/007/008 but 010_hi_nqs_sqd
    and 011_hi_nqs_skqd are also iterative (HI methods). Added 010/011 to
    the prefix check. Verified: all 5 iterative groups now listed in skipped
    set when --skip-iterative is active.

Bugs 2-8 — 7 wrapper scripts at line 56 (YAML section precedence)
    load_config() mutates args.device and args.molecule with merged
    defaults (line 125 of config_loader.py: setattr for every merged key),
    so the wrapper's `args.device or section.get("device")` short-circuited
    on args.device="auto" and silently ignored the YAML section's device
    value. Fixed by importing _get_explicit_cli_args from config_loader
    and only using the CLI arg when the user explicitly typed --device on
    the command line. Same fix applied to --molecule (positional). Verified:
    running with --device cpu correctly overrides section; running without
    --device correctly picks up the section value (auto→cuda detection).

Nit 9 — src/qvartools/methods/nqs/__init__.py docstring
    Docstring claimed METHODS_REGISTRY was "used by the 010-013 wrapper
    scripts so they can dispatch by name without hard-importing each method
    module", but the wrappers were doing direct imports. Fixed architecturally:
    all 7 wrappers now dispatch via METHODS_REGISTRY[METHOD_KEY] and only
    import METHODS_REGISTRY (not individual config/run pairs). The __init__.py
    itself still imports all four methods because the registry needs
    callable references — that's a Python fundamental, not a design flaw.

Nit 10 — src/qvartools/methods/nqs/_shared.py docstring
    Docstring said "behaviour-preserving extractions only — no API changes",
    but extract_orbital_counts adds the mol_info→hamiltonian.integrals
    fallback to nqs_sqd/nqs_skqd, which fixes a pre-existing bug in those
    runners (they were end-to-end broken because they accessed
    mol_info["n_orbitals"] directly but get_molecule doesn't populate it).
    Updated docstring to distinguish the three scopes: (1) build_autoregressive_nqs
    = pure refactor, (2) validate_initial_basis = pure refactor, (3)
    extract_orbital_counts = pure refactor for HI methods + bug fix for
    simple methods.

Testing:
- 72 tests pass (test_methods + test_run_all_pipelines + test_config_loader)
  — up from 62 in the previous commit (test_config_loader added to the
  test surface this round)
- ruff check + ruff format clean
- Smoke tested 010_hi_nqs_sqd/default.py h2: reaches FCI exactly (0.0 mHa)
- Smoke tested 012_nqs_sqd/default.py h2 both with --device cpu
  (CLI override works) and without --device (section default picked up,
  auto→cuda detection)
- --skip-iterative correctly lists all 5 iterative groups
  (006,007,008,010,011) in the skipped set

Out of scope (pre-existing, not caused by this PR):
- 007_iterative_nqs_dci/iter_nqs_dci_sqd.py and iter_nqs_dci_krylov_classical.py
  fail with "initial_basis must be integer or bool dtype ... got torch.float32".
  Verified these same scripts fail identically on main (stash-checkout-unstash
  test). Pre-existing bug in the 007 group's handling of initial_basis.
  Should be filed as a separate issue; not touched in this PR.

* refactor+docs: address Copilot round-2 finding + self-review P0/P1/P2 items

Option C: fix all 8 items from my own deep review on PR #39 plus Copilot's
one remaining round-2 finding, then re-ran the full smoke + lint + test net
and did a second ultrathink code review before commit.

1. docs/source/user_guide/pipelines.rst  (Copilot round 2)
   Tier 2 table rows now use the actual on-disk folder names with
   underscore and monospace formatting (``010_hi_nqs_sqd`` etc.), not
   the human-facing "010 hi_nqs_sqd" form. Added a new
   "Programmatic dispatch via ``METHODS_REGISTRY``" subsection
   documenting all 9 registry fields + an idiomatic code example.
   Rewrote the Iterative Pipelines example to show the registry
   dispatch pattern (matches how the 010-013 wrappers now work) while
   noting that direct imports still work.

2. CHANGELOG.md  (P0: user-visible breaking changes were undocumented)
   Added entries under Changed (2x BREAKING for rename + --only),
   Added (Tier 2 catalog, METHODS_REGISTRY, _shared.py, public
   get_explicit_cli_args), and Fixed (nqs_sqd/nqs_skqd mol_info bug
   rediscovered via smoke test).

3. experiments/pipelines/run_all_pipelines.py  (P0: --only silent failure)
   Detect pre-2026-04-07 2-digit --only format and print a migration
   warning to stderr with the recommended 3-digit form. "--only 01 02"
   now emits "use --only 001 002 instead" instead of silently running
   0 pipelines. Verified at runtime.

4. experiments/config_loader.py  (P1: private symbol imported externally)
   Renamed _get_explicit_cli_args -> get_explicit_cli_args (public name
   is now primary). Added _get_explicit_cli_args as a backward-compat
   alias at module bottom so tests/test_config_loader.py keeps working
   without changes. Docstring updated to explain the alias. Internal
   call in load_config() uses the public name.

5-6. 7 wrapper scripts in 010-013 (P1: multi-section YAML silent fallback)
   Regenerated via template with two new module-level helpers:
   - _resolve_section(cfg, variant): extracts the requested section
     from multi-section YAML, warns when falling back to flat cfg if
     OTHER sections exist (indicating the user picked the wrong script
     or the wrong --config file)
   - _build_config_kwargs(section, config_cls, device): filters section
     keys to dataclass fields, warns on unknown keys (catches YAML
     typos like "n_itrations" silently being dropped)
   Also switched to importing get_explicit_cli_args (public name).
   Verified at runtime: wrong-section fallback warns, typo'd YAML key
   warns, default path still reaches FCI exactly on H2.

7. docs/source/user_guide/pipelines.rst  (P2: METHODS_REGISTRY undocumented)
   Covered by #1 above — new subsection documents the registry.

8. docs/source/user_guide/pipelines.rst  (P2: iterative example inconsistent
   with wrapper pattern)
   Covered by #1 above — iterative example now uses registry dispatch.

Verification after all 8 fixes:
- 61 tests pass (tests/test_methods/ + tests/test_run_all_pipelines.py +
  tests/test_config_loader.py -m "not slow and not gpu and not pyscf")
- ruff check src/ tests/ experiments/ — clean
- ruff format --check src/ tests/ experiments/ — 206 files already formatted
- Runtime smoke tests on H2/CPU:
  * --only 01 02 emits migration warning (verified)
  * multi-section YAML with missing section emits fallback warning (verified)
  * YAML with typo'd key emits unknown-key warning (verified)
  * 010_hi_nqs_sqd/default.py h2 still reaches FCI (0.0 mHa, ~9s)
  * run_all_pipelines.py --only 001 005 012 013 --skip-quantum: 6/6 pass
- All 7 wrappers import cleanly with correct METHOD_KEY/VARIANT constants
- nqs_sqd/nqs_skqd still use extract_orbital_counts (grep confirmed)
- RST heading "Programmatic dispatch via ``METHODS_REGISTRY``" is 46 chars,
  underline is 46 ^ (Sphinx requires >= title length, exact match OK)
- METHODS_REGISTRY 9 fields enumerated in docs match __init__.py definition

Second code review findings (all minor, none blocking):
- _resolve_section + _build_config_kwargs helpers duplicated across 7
  wrapper files (~280 lines). Could extract to shared _wrapper_helpers.py
  module in a follow-up. Not a bug.
- --only warning does not catch single-digit --only 1 2. Low impact.
- _resolve_section's isinstance(v, dict) filter could miss dict-valued
  config fields. No current NQS config has dict fields, so no collision.
- CHANGELOG get_explicit_cli_args entry re-worded to clarify the public
  name is now primary, not the other way around.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants