Merged
43 changes: 27 additions & 16 deletions AGENTS.md
@@ -331,17 +331,24 @@ qvartools/
├── experiments/ # Reproducible experiment scripts
│ ├── config_loader.py # YAML loader with CLI overrides (--config, --device)
│ ├── profile_pipeline.py # Wall-clock profiling of pipeline stages
│ └── pipelines/ # 24 end-to-end pipeline scripts (8 groups × 3 diag modes)
│ ├── run_all_pipelines.py # Run all 24 pipelines and compare results
│ ├── configs/ # 8 YAML config files (one per group)
│ ├── 01_dci/ # Direct-CI (no NF): classical, quantum, SQD
│ ├── 02_nf_dci/ # NF + DCI merge: classical, quantum, SQD
│ ├── 03_nf_dci_pt2/ # NF + DCI + PT2 expansion: classical, quantum, SQD
│ ├── 04_nf_only/ # NF-only ablation: classical, quantum, SQD
│ ├── 05_hf_only/ # HF-only baseline: classical, quantum, SQD
│ ├── 06_iterative_nqs/ # Iterative NQS: classical, quantum, SQD
│ ├── 07_iterative_nqs_dci/ # NF+DCI merge → iterative NQS: classical, quantum, SQD
│ └── 08_iterative_nqs_dci_pt2/ # NF+DCI+PT2 → iterative NQS: classical, quantum, SQD
│ └── pipelines/ # 33 end-to-end pipeline scripts (3-digit prefix catalog)
│ ├── run_all_pipelines.py # Run all 33 pipelines and compare results
│ ├── configs/ # 13 YAML configs (9 ablation + 4 method-as-pipeline)
│ │
│ ├── 001_dci/ # Direct-CI (no NF): classical, quantum, SQD
│ ├── 002_nf_dci/ # NF + DCI merge: classical, quantum, SQD
│ ├── 003_nf_dci_pt2/ # NF + DCI + PT2 expansion: classical, quantum, SQD
│ ├── 004_nf_only/ # NF-only ablation: classical, quantum, SQD
│ ├── 005_hf_only/ # HF-only baseline: classical, quantum, SQD
│ ├── 006_iterative_nqs/ # Iterative NQS: classical, quantum, SQD
│ ├── 007_iterative_nqs_dci/ # NF+DCI merge → iterative NQS
│ ├── 008_iterative_nqs_dci_pt2/ # NF+DCI+PT2 → iterative NQS
│ ├── 009_vqe/ # CUDA-QX VQE: UCCSD, ADAPT-VQE
│ │
│ ├── 010_hi_nqs_sqd/ # HI+NQS+SQD: default, pt2, ibm_off
│ ├── 011_hi_nqs_skqd/ # HI+NQS+SKQD: default, ibm_on
│ ├── 012_nqs_sqd/ # NQS+SQD: default
│ └── 013_nqs_skqd/ # NQS+SKQD: default
├── docs/ # Documentation
│ ├── architecture.md # Design philosophy, module dependency graph
@@ -599,18 +606,22 @@ pytest --cov=qvartools --cov-report=term-missing

### Config Loader Pattern

All 24 pipeline scripts use the shared `config_loader.py`:
All 33 pipeline scripts use the shared `config_loader.py`:

```bash
python experiments/pipelines/02_nf_dci/nf_dci_krylov_classical.py h2 --device cuda
python experiments/pipelines/002_nf_dci/nf_dci_krylov_classical.py h2 --device cuda
python experiments/pipelines/run_all_pipelines.py h2 --device cuda
python experiments/pipelines/02_nf_dci/nf_dci_krylov_classical.py lih \
--config experiments/pipelines/configs/02_nf_dci.yaml --max-epochs 200
python experiments/pipelines/002_nf_dci/nf_dci_krylov_classical.py lih \
--config experiments/pipelines/configs/002_nf_dci.yaml --max-epochs 200

# New 010-013 method-as-pipeline catalog
python experiments/pipelines/010_hi_nqs_sqd/default.py h2 --device cuda
python experiments/pipelines/010_hi_nqs_sqd/pt2.py h2 --device cuda
```

**Precedence:** CLI args > YAML file > hardcoded defaults.
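The precedence rule above can be sketched as a three-layer merge. The key subtlety is the one the changelog's `get_explicit_cli_args` addresses: only CLI args the user actually typed may override the YAML file, otherwise argparse defaults would shadow YAML values. Names here (`merge_config`, `HARDCODED_DEFAULTS`) are illustrative, not the real `config_loader.py` API:

```python
# Illustrative defaults -- not the actual config_loader.py values.
HARDCODED_DEFAULTS = {"device": "cpu", "max_epochs": 100}

def merge_config(yaml_cfg, cli_args, explicit_cli_keys):
    """Apply the documented precedence: CLI args > YAML file > defaults.

    Only explicitly typed CLI args (explicit_cli_keys) override YAML;
    argparse defaults must not shadow YAML-provided values.
    """
    cfg = dict(HARDCODED_DEFAULTS)    # lowest priority
    cfg.update(yaml_cfg)              # YAML overrides hardcoded defaults
    for key in explicit_cli_keys:     # explicitly typed CLI args win
        cfg[key] = cli_args[key]
    return cfg

yaml_cfg = {"max_epochs": 200}
cli_args = {"device": "cuda", "max_epochs": 100}  # 100 is just argparse's default
merged = merge_config(yaml_cfg, cli_args, explicit_cli_keys={"device"})
# YAML's max_epochs survives because --max-epochs was not explicitly typed
print(merged)  # {'device': 'cuda', 'max_epochs': 200}
```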

### 24 Pipeline Variants (8 Groups × 3 Diag Modes)
### 33 Pipeline Variants (9 ablation groups + 4 method-as-pipeline groups)

| Group | Basis Source | Classical Krylov | Quantum Krylov | SQD |
|-------|-------------|-----------------|----------------|-----|
32 changes: 32 additions & 0 deletions CHANGELOG.md
@@ -8,6 +8,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]

### Changed
- **BREAKING**: `experiments/pipelines/` folders renamed from 2-digit to 3-digit prefix
(`01_dci` → `001_dci`, ..., `09_vqe` → `009_vqe`). YAML configs in
`experiments/pipelines/configs/` renamed to match. Leaves room for the 010-099
method-as-pipeline catalog tier.
- **BREAKING**: `run_all_pipelines.py --only` now requires 3-digit group
  prefixes (e.g., `--only 001 002 004`). 2-digit values like `--only 01 02`
  previously matched no groups and failed silently; they now trigger a
  migration warning that states the recommended 3-digit form. Scripts, docs,
  and `docs/experiments_guide.md` updated.
- **BREAKING**: Rename `SampleBasedKrylovDiagonalization` to `ClassicalKrylovDiagonalization` (ADR-001)
- **BREAKING**: Rename `FlowGuidedSKQD` to `FlowGuidedKrylovDiag` (ADR-001)
- **BREAKING**: Default `subspace_mode` changed from `"skqd"` to `"classical_krylov"`
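The `--only` migration behavior described above can be sketched as a prefix filter that warns on the legacy 2-digit form rather than matching nothing. The function name and the (abbreviated) group list are hypothetical; whether the real script also auto-migrates the value is not specified in the changelog, so this sketch only warns:

```python
import re
import warnings

# Abbreviated, illustrative group list -- the real catalog has 13 folders.
KNOWN_GROUPS = ["001_dci", "002_nf_dci", "009_vqe", "010_hi_nqs_sqd"]

def resolve_only_prefixes(prefixes):
    """Match --only values against group folders; warn on legacy 2-digit form."""
    selected = []
    for p in prefixes:
        if re.fullmatch(r"\d{2}", p):
            # Legacy form matched no groups silently before; now we say so.
            warnings.warn(
                f"--only {p} is the legacy 2-digit form and matches no "
                f"groups; use the 3-digit prefix {p.zfill(3)!r} instead"
            )
            continue
        selected += [g for g in KNOWN_GROUPS if g.startswith(p + "_")]
    return selected

print(resolve_only_prefixes(["01", "010"]))  # ['010_hi_nqs_sqd'] plus a warning
```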
@@ -17,6 +25,25 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- `FCISolver._dense_fallback()` returns `None` instead of raising `RuntimeError` for large Hilbert spaces

### Added
- **Pipeline catalog Tier 2**: 4 new pipeline folders (`010_hi_nqs_sqd`,
`011_hi_nqs_skqd`, `012_nqs_sqd`, `013_nqs_skqd`) wrapping the
`qvartools.methods.nqs.*` runners as first-class benchmark catalog entries.
Total pipeline scripts: 26 → 33. Each method gets a folder; variants live
as separate scripts inside the folder with a multi-section YAML config.
- `qvartools.methods.nqs.METHODS_REGISTRY`: public dict keyed by method id
(`"nqs_sqd"`, `"nqs_skqd"`, `"hi_nqs_sqd"`, `"hi_nqs_skqd"`) mapping to
runner function, config class, capability flags, and pipeline folder
metadata. Used by 010-013 wrappers and available for benchmark harnesses.
- `src/qvartools/methods/nqs/_shared.py`: internal module with
`build_autoregressive_nqs`, `extract_orbital_counts`,
`validate_initial_basis` helpers extracted from the four NQS method
modules to remove duplication.
- `experiments.config_loader.get_explicit_cli_args`: the previously
private `_get_explicit_cli_args` is now the public entry point under
this name (the leading-underscore alias is kept for backward
compatibility with existing callers). Used by the 010-013 wrapper
scripts to detect which CLI args were explicitly typed, so YAML
section defaults actually apply when `--device`/molecule is omitted.
- `compute_molecular_integrals` now accepts `cas` and `casci` parameters for CAS active-space reduction
- 14 new CAS molecules in registry (26 total): N₂-CAS(10,12/15/17/20/26), Cr₂ + variants up to 72Q, Benzene CAS(6,15)
- IBM `solve_fermion` auto-enabled when `qiskit_addon_sqd` is installed (α×β Cartesian product, dramatically better accuracy)
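The `METHODS_REGISTRY` entry described above — method id mapping to runner, config class, capability flags, and pipeline-folder metadata — might be consumed by a 010-013 wrapper roughly as follows. The entry layout, field names, and runner signature here are illustrative assumptions, not the actual `qvartools.methods.nqs` API:

```python
from dataclasses import dataclass, field
from typing import Callable

# Illustrative entry shape -- the real layout lives in
# qvartools.methods.nqs.METHODS_REGISTRY and may differ.
@dataclass
class MethodEntry:
    runner: Callable
    config_cls: type
    capabilities: dict = field(default_factory=dict)
    pipeline_folder: str = ""

@dataclass
class HiNqsSqdConfig:        # stand-in config class
    device: str = "cpu"
    use_pt2: bool = False

def run_hi_nqs_sqd(cfg):     # stand-in runner
    return {"method": "hi_nqs_sqd", "pt2": cfg.use_pt2}

METHODS_REGISTRY = {
    "hi_nqs_sqd": MethodEntry(
        runner=run_hi_nqs_sqd,
        config_cls=HiNqsSqdConfig,
        capabilities={"supports_pt2": True, "needs_ibm": False},
        pipeline_folder="010_hi_nqs_sqd",
    ),
}

# A wrapper script resolves its method by id instead of importing the runner:
entry = METHODS_REGISTRY["hi_nqs_sqd"]
result = entry.runner(entry.config_cls(use_pt2=True))
print(result)  # {'method': 'hi_nqs_sqd', 'pt2': True}
```

Keeping the capability flags in the registry lets a benchmark harness skip, say, IBM-dependent variants without importing each runner module.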
@@ -48,6 +75,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- S-CORE (`recover_configurations`) from HI-NQS-SQD IBM path — designed for quantum hardware noise, not needed for classical NQS samples (NH₃ 1.5 hr → 5 s)

### Fixed
- `nqs_sqd.py` and `nqs_skqd.py` were end-to-end broken: they accessed
`mol_info["n_orbitals"]` directly, but `get_molecule()` does not populate
that key. Routed through the new `extract_orbital_counts()` helper which
falls back to `hamiltonian.integrals` (same fallback logic the HI methods
already had). Both runners now smoke-tested on H₂.
- `TransformerNFSampler._build_nqs()` used wrong parameter name `hidden_dim` instead of `hidden_dims`
- `hi_nqs_sqd.py` passed tensors instead of numpy arrays to `vectorized_dedup`
- Groups 07/08 pipelines discarded NF+DCI basis when calling iterative NQS solvers (Issue #10)
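The `extract_orbital_counts()` fix above — prefer the `mol_info` key, fall back to the integral shapes — can be sketched as below. The real helper lives in `src/qvartools/methods/nqs/_shared.py` and its exact signature differs (the changelog mentions attribute-style `hamiltonian.integrals`); this dict-based version is only an assumption-laden illustration of the fallback logic:

```python
import numpy as np

def extract_orbital_counts(mol_info, hamiltonian=None):
    """Prefer mol_info['n_orbitals']; fall back to integral shapes if absent."""
    n_orb = mol_info.get("n_orbitals")
    if n_orb is None and hamiltonian is not None:
        # One-body integrals are (n_orb, n_orb), so the shape gives the count.
        n_orb = hamiltonian["integrals"]["h1e"].shape[0]
    if n_orb is None:
        raise KeyError("n_orbitals not in mol_info and no integrals to fall back on")
    return n_orb

# get_molecule() does not populate "n_orbitals", so the fallback fires:
ham = {"integrals": {"h1e": np.zeros((4, 4))}}
print(extract_orbital_counts({}, ham))  # 4
```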
17 changes: 12 additions & 5 deletions README.md
@@ -100,18 +100,25 @@ print(f"Energy: {results['final_energy']:.10f} Ha")
### Running experiment pipelines

```bash
# Run a single pipeline on H2
python experiments/pipelines/01_dci/dci_krylov_classical.py h2 --device cuda
# Run a single ablation pipeline on H2
python experiments/pipelines/001_dci/dci_krylov_classical.py h2 --device cuda

# Run all 24 pipelines and compare
# Run a method-as-pipeline benchmark (010+ catalog)
python experiments/pipelines/010_hi_nqs_sqd/default.py h2 --device cuda
python experiments/pipelines/010_hi_nqs_sqd/pt2.py h2 --device cuda

# Run all 33 pipelines and compare
python experiments/pipelines/run_all_pipelines.py h2 --device cuda

# Filter by group prefix (3-digit)
python experiments/pipelines/run_all_pipelines.py h2 --only 001 005 010

# Skip quantum or iterative pipelines for faster validation
python experiments/pipelines/run_all_pipelines.py h2 --skip-quantum --skip-iterative

# Run with a YAML config override
python experiments/pipelines/02_nf_dci/nf_dci_krylov_classical.py lih \
--config experiments/pipelines/configs/02_nf_dci.yaml --max-epochs 200
python experiments/pipelines/002_nf_dci/nf_dci_krylov_classical.py lih \
--config experiments/pipelines/configs/002_nf_dci.yaml --max-epochs 200
```

## Package Architecture
6 changes: 3 additions & 3 deletions docs/decisions/002-eliminate-torch-numpy-roundtrips.md
@@ -282,12 +282,12 @@ ruff format --check src/ tests/ experiments/
# 5. End-to-end pipeline validation (CRITICAL)
cd experiments/pipelines
python run_all_pipelines.py h2 --device cuda
# Expected: 24/24 pipelines pass
# Expected: 33/33 pipelines pass (post-2026-04-07 catalog with 010-013)

# 6. Performance regression check
# Run the specific pipeline before and after, compare wall time
python pipelines/01_dci/dci_sqd.py h2 --device cuda # PR-A
python pipelines/01_dci/dci_krylov_classical.py h2 --device cuda # PR-B
python pipelines/001_dci/dci_sqd.py h2 --device cuda # PR-A
python pipelines/001_dci/dci_krylov_classical.py h2 --device cuda # PR-B
```

---
36 changes: 36 additions & 0 deletions docs/decisions/003-gpu-native-sbd-integration.md
@@ -146,6 +146,42 @@ nvcc -std=c++17 -O3 \
# ARM64: no cross-compilation needed (native on DGX Spark)
```

## Update 2026-04-07: parallel CuPy path + correction on speedup interpretation

**Correction to the speedup table above**: re-reading arXiv:2601.16169 carefully,
AMD-HPC/amd-sbd's 95x is measured on **MI250X (AMD GPU, OpenMP offload)**.
On **GB200** the same library only achieves **2.64x** — a 36x gap between the
two numbers in the same paper. The "95x" cannot be naively claimed for any
NVIDIA target. For DGX Spark GB10 (weaker than GB200 in both bandwidth and
compute), the realistic expectation from amd-sbd is **at most 2.64x, likely less**. This
significantly narrows the speedup advantage of the C++/sbd path over a
well-written native Python (CuPy) implementation on NVIDIA hardware.

A pure-CuPy alternative path (factored-space sigma vector + Davidson driver)
is now tracked in **Issue #38**. Motivations:

1. **No C++ build / MPI strip / sm_121 compilation risk** — pure-Python, runs
wherever CuPy works. Eliminates the entire toolchain risk surface from this
ADR.
2. **Davidson with diagonal preconditioner is non-negotiable for multireference
chemistry** (Cr₂ CAS(12,18) through CAS(12,36), N₂ dissociation, open-shell).
Lanczos has ghost-eigenvalue failure modes for near-degenerate spectra;
PySCF uses Davidson for FCI for this reason. **Whichever backend wins (sbd
or CuPy), proper Davidson must be implemented before Cr₂ work begins.**
This commitment is also recorded in `memory/project_phase2b_davidson_commitment.md`.
3. **Insurance path**: if sbd nanobind Phase 2 hits compilation blockers
(CUDA 13.0 CCCL relocation, MPI strip, sm_121 SASS compatibility), Issue
#38 can carry the long-term work independently.

The sbd path described in this ADR remains active — Issue #38 is **parallel,
not a replacement**. Re-evaluate priorities once both Phase 1 prototypes have
measured numbers on actual DGX Spark GB10 hardware.
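To make point 2 concrete, a minimal Davidson iteration with the diagonal (Jacobi) preconditioner looks like this. This is a didactic NumPy sketch of the algorithm, not the Issue #38 CuPy implementation: production code would use a matrix-free factored-space sigma build, subspace restarts, and multi-root locking.

```python
import numpy as np

def davidson_lowest(A, n_iter=50, tol=1e-8):
    """Lowest eigenpair of a symmetric matrix via Davidson iteration."""
    n = A.shape[0]
    diag = np.diag(A)
    # Start from the unit vector on the smallest diagonal element.
    v = np.zeros(n)
    v[np.argmin(diag)] = 1.0
    V = v[:, None]
    for _ in range(n_iter):
        # Rayleigh-Ritz step in the current subspace.
        AV = A @ V
        theta, s = np.linalg.eigh(V.T @ AV)
        theta0, x = theta[0], V @ s[:, 0]
        r = A @ x - theta0 * x                  # residual vector
        if np.linalg.norm(r) < tol:
            return theta0, x
        # Diagonal preconditioner: the targeted correction that avoids the
        # ghost-eigenvalue failure modes plain Lanczos shows for
        # near-degenerate spectra.
        denom = diag - theta0
        denom[np.abs(denom) < 1e-12] = 1e-12
        t = r / denom
        # Orthogonalize against the subspace and expand it by one vector.
        t -= V @ (V.T @ t)
        t /= np.linalg.norm(t)
        V = np.hstack([V, t[:, None]])
    return theta0, x

rng = np.random.default_rng(0)
M = rng.standard_normal((50, 50))
A = (M + M.T) / 2 + np.diag(np.arange(50.0))    # diagonally dominant test matrix
e0, _ = davidson_lowest(A)
print(abs(e0 - np.linalg.eigvalsh(A)[0]) < 1e-6)  # True
```

Every line here maps onto either backend: the `A @ V` products become the sbd or CuPy sigma build, while the Ritz step and preconditioner stay cheap and backend-agnostic.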

**Cross-references added 2026-04-07**:
- Issue #38: CuPy factored-space Davidson tracking (parallel path)
- `memory/project_phase2b_davidson_commitment.md`: Phase 2b non-negotiable commitment
- `memory/feedback_gpu_sku_extrapolation.md`: lesson on misreading vendor speedup claims

## References

- r-ccs-cms/sbd: https://github.com/r-ccs-cms/sbd