Merged
43 changes: 27 additions & 16 deletions AGENTS.md
@@ -331,17 +331,24 @@ qvartools/
├── experiments/ # Reproducible experiment scripts
│ ├── config_loader.py # YAML loader with CLI overrides (--config, --device)
│ ├── profile_pipeline.py # Wall-clock profiling of pipeline stages
│ └── pipelines/ # 24 end-to-end pipeline scripts (8 groups × 3 diag modes)
│ ├── run_all_pipelines.py # Run all 24 pipelines and compare results
│ ├── configs/ # 8 YAML config files (one per group)
│ ├── 01_dci/ # Direct-CI (no NF): classical, quantum, SQD
│ ├── 02_nf_dci/ # NF + DCI merge: classical, quantum, SQD
│ ├── 03_nf_dci_pt2/ # NF + DCI + PT2 expansion: classical, quantum, SQD
│ ├── 04_nf_only/ # NF-only ablation: classical, quantum, SQD
│ ├── 05_hf_only/ # HF-only baseline: classical, quantum, SQD
│ ├── 06_iterative_nqs/ # Iterative NQS: classical, quantum, SQD
│ ├── 07_iterative_nqs_dci/ # NF+DCI merge → iterative NQS: classical, quantum, SQD
│ └── 08_iterative_nqs_dci_pt2/ # NF+DCI+PT2 → iterative NQS: classical, quantum, SQD
│ └── pipelines/ # 33 end-to-end pipeline scripts (3-digit prefix catalog)
│ ├── run_all_pipelines.py # Run all 33 pipelines and compare results
│ ├── configs/ # 13 YAML configs (9 ablation + 4 method-as-pipeline)
│ │
│ ├── 001_dci/ # Direct-CI (no NF): classical, quantum, SQD
│ ├── 002_nf_dci/ # NF + DCI merge: classical, quantum, SQD
│ ├── 003_nf_dci_pt2/ # NF + DCI + PT2 expansion: classical, quantum, SQD
│ ├── 004_nf_only/ # NF-only ablation: classical, quantum, SQD
│ ├── 005_hf_only/ # HF-only baseline: classical, quantum, SQD
│ ├── 006_iterative_nqs/ # Iterative NQS: classical, quantum, SQD
│ ├── 007_iterative_nqs_dci/ # NF+DCI merge → iterative NQS
│ ├── 008_iterative_nqs_dci_pt2/ # NF+DCI+PT2 → iterative NQS
│ ├── 009_vqe/ # CUDA-QX VQE: UCCSD, ADAPT-VQE
│ │
│ ├── 010_hi_nqs_sqd/ # HI+NQS+SQD: default, pt2, ibm_off
│ ├── 011_hi_nqs_skqd/ # HI+NQS+SKQD: default, ibm_on
│ ├── 012_nqs_sqd/ # NQS+SQD: default
│ └── 013_nqs_skqd/ # NQS+SKQD: default
├── docs/ # Documentation
│ ├── architecture.md # Design philosophy, module dependency graph
@@ -599,18 +606,22 @@ pytest --cov=qvartools --cov-report=term-missing

### Config Loader Pattern

All 24 pipeline scripts use the shared `config_loader.py`:
All 33 pipeline scripts use the shared `config_loader.py`:

```bash
python experiments/pipelines/02_nf_dci/nf_dci_krylov_classical.py h2 --device cuda
python experiments/pipelines/002_nf_dci/nf_dci_krylov_classical.py h2 --device cuda
python experiments/pipelines/run_all_pipelines.py h2 --device cuda
python experiments/pipelines/02_nf_dci/nf_dci_krylov_classical.py lih \
--config experiments/pipelines/configs/02_nf_dci.yaml --max-epochs 200
python experiments/pipelines/002_nf_dci/nf_dci_krylov_classical.py lih \
--config experiments/pipelines/configs/002_nf_dci.yaml --max-epochs 200

# New 010-013 method-as-pipeline catalog
python experiments/pipelines/010_hi_nqs_sqd/default.py h2 --device cuda
python experiments/pipelines/010_hi_nqs_sqd/pt2.py h2 --device cuda
```

**Precedence:** CLI args > YAML file > hardcoded defaults.
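The precedence rule above can be sketched as a three-layer merge. The key subtlety is the one the changelog's `get_explicit_cli_args` addresses: only CLI args the user actually typed may override the YAML file, otherwise argparse defaults would shadow YAML values. Names here (`merge_config`, `HARDCODED_DEFAULTS`) are illustrative, not the real `config_loader.py` API:

```python
# Illustrative defaults -- not the actual config_loader.py values.
HARDCODED_DEFAULTS = {"device": "cpu", "max_epochs": 100}

def merge_config(yaml_cfg, cli_args, explicit_cli_keys):
    """Apply the documented precedence: CLI args > YAML file > defaults.

    Only explicitly typed CLI args (explicit_cli_keys) override YAML;
    argparse defaults must not shadow YAML-provided values.
    """
    cfg = dict(HARDCODED_DEFAULTS)    # lowest priority
    cfg.update(yaml_cfg)              # YAML overrides hardcoded defaults
    for key in explicit_cli_keys:     # explicitly typed CLI args win
        cfg[key] = cli_args[key]
    return cfg

yaml_cfg = {"max_epochs": 200}
cli_args = {"device": "cuda", "max_epochs": 100}  # 100 is just argparse's default
merged = merge_config(yaml_cfg, cli_args, explicit_cli_keys={"device"})
# YAML's max_epochs survives because --max-epochs was not explicitly typed
print(merged)  # {'device': 'cuda', 'max_epochs': 200}
```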

### 24 Pipeline Variants (8 Groups × 3 Diag Modes)
### 33 Pipeline Variants (9 ablation groups + 4 method-as-pipeline groups)

| Group | Basis Source | Classical Krylov | Quantum Krylov | SQD |
|-------|-------------|-----------------|----------------|-----|
32 changes: 32 additions & 0 deletions CHANGELOG.md
@@ -8,6 +8,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]

### Changed
- **BREAKING**: `experiments/pipelines/` folders renamed from 2-digit to 3-digit prefix
(`01_dci` → `001_dci`, ..., `09_vqe` → `009_vqe`). YAML configs in
`experiments/pipelines/configs/` renamed to match. Leaves room for the 010-099
method-as-pipeline catalog tier.
- **BREAKING**: `run_all_pipelines.py --only` now requires 3-digit group
  prefixes (e.g., `--only 001 002 004`). 2-digit values like `--only 01 02`
  previously matched no groups and failed silently; they now trigger a
  migration warning that states the recommended 3-digit form. Scripts, docs,
  and `docs/experiments_guide.md` updated.
- **BREAKING**: Rename `SampleBasedKrylovDiagonalization` to `ClassicalKrylovDiagonalization` (ADR-001)
- **BREAKING**: Rename `FlowGuidedSKQD` to `FlowGuidedKrylovDiag` (ADR-001)
- **BREAKING**: Default `subspace_mode` changed from `"skqd"` to `"classical_krylov"`
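The `--only` migration behavior described above can be sketched as a prefix filter that warns on the legacy 2-digit form rather than matching nothing. The function name and the (abbreviated) group list are hypothetical; whether the real script also auto-migrates the value is not specified in the changelog, so this sketch only warns:

```python
import re
import warnings

# Abbreviated, illustrative group list -- the real catalog has 13 folders.
KNOWN_GROUPS = ["001_dci", "002_nf_dci", "009_vqe", "010_hi_nqs_sqd"]

def resolve_only_prefixes(prefixes):
    """Match --only values against group folders; warn on legacy 2-digit form."""
    selected = []
    for p in prefixes:
        if re.fullmatch(r"\d{2}", p):
            # Legacy form matched no groups silently before; now we say so.
            warnings.warn(
                f"--only {p} is the legacy 2-digit form and matches no "
                f"groups; use the 3-digit prefix {p.zfill(3)!r} instead"
            )
            continue
        selected += [g for g in KNOWN_GROUPS if g.startswith(p + "_")]
    return selected

print(resolve_only_prefixes(["01", "010"]))  # ['010_hi_nqs_sqd'] plus a warning
```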
@@ -17,6 +25,25 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- `FCISolver._dense_fallback()` returns `None` instead of raising `RuntimeError` for large Hilbert spaces

### Added
- **Pipeline catalog Tier 2**: 4 new pipeline folders (`010_hi_nqs_sqd`,
`011_hi_nqs_skqd`, `012_nqs_sqd`, `013_nqs_skqd`) wrapping the
`qvartools.methods.nqs.*` runners as first-class benchmark catalog entries.
Total pipeline scripts: 26 → 33. Each method gets a folder; variants live
as separate scripts inside the folder with a multi-section YAML config.
- `qvartools.methods.nqs.METHODS_REGISTRY`: public dict keyed by method id
(`"nqs_sqd"`, `"nqs_skqd"`, `"hi_nqs_sqd"`, `"hi_nqs_skqd"`) mapping to
runner function, config class, capability flags, and pipeline folder
metadata. Used by 010-013 wrappers and available for benchmark harnesses.
- `src/qvartools/methods/nqs/_shared.py`: internal module with
`build_autoregressive_nqs`, `extract_orbital_counts`,
`validate_initial_basis` helpers extracted from the four NQS method
modules to remove duplication.
- `experiments.config_loader.get_explicit_cli_args`: the previously
private `_get_explicit_cli_args` is now the public entry point under
this name (the leading-underscore alias is kept for backward
compatibility with existing callers). Used by the 010-013 wrapper
scripts to detect which CLI args were explicitly typed, so YAML
section defaults actually apply when `--device`/molecule is omitted.
- `compute_molecular_integrals` now accepts `cas` and `casci` parameters for CAS active-space reduction
- 14 new CAS molecules in registry (26 total): N₂-CAS(10,12/15/17/20/26), Cr₂ + variants up to 72Q, Benzene CAS(6,15)
- IBM `solve_fermion` auto-enabled when `qiskit_addon_sqd` is installed (α×β Cartesian product, dramatically better accuracy)
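The `METHODS_REGISTRY` entry described above — method id mapping to runner, config class, capability flags, and pipeline-folder metadata — might be consumed by a 010-013 wrapper roughly as follows. The entry layout, field names, and runner signature here are illustrative assumptions, not the actual `qvartools.methods.nqs` API:

```python
from dataclasses import dataclass, field
from typing import Callable

# Illustrative entry shape -- the real layout lives in
# qvartools.methods.nqs.METHODS_REGISTRY and may differ.
@dataclass
class MethodEntry:
    runner: Callable
    config_cls: type
    capabilities: dict = field(default_factory=dict)
    pipeline_folder: str = ""

@dataclass
class HiNqsSqdConfig:        # stand-in config class
    device: str = "cpu"
    use_pt2: bool = False

def run_hi_nqs_sqd(cfg):     # stand-in runner
    return {"method": "hi_nqs_sqd", "pt2": cfg.use_pt2}

METHODS_REGISTRY = {
    "hi_nqs_sqd": MethodEntry(
        runner=run_hi_nqs_sqd,
        config_cls=HiNqsSqdConfig,
        capabilities={"supports_pt2": True, "needs_ibm": False},
        pipeline_folder="010_hi_nqs_sqd",
    ),
}

# A wrapper script resolves its method by id instead of importing the runner:
entry = METHODS_REGISTRY["hi_nqs_sqd"]
result = entry.runner(entry.config_cls(use_pt2=True))
print(result)  # {'method': 'hi_nqs_sqd', 'pt2': True}
```

Keeping the capability flags in the registry lets a benchmark harness skip, say, IBM-dependent variants without importing each runner module.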
@@ -48,6 +75,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- S-CORE (`recover_configurations`) from HI-NQS-SQD IBM path — designed for quantum hardware noise, not needed for classical NQS samples (NH₃ 1.5 hr → 5 s)

### Fixed
- `nqs_sqd.py` and `nqs_skqd.py` were end-to-end broken: they accessed
`mol_info["n_orbitals"]` directly, but `get_molecule()` does not populate
that key. Routed through the new `extract_orbital_counts()` helper which
falls back to `hamiltonian.integrals` (same fallback logic the HI methods
already had). Both runners now smoke-tested on H₂.
- `TransformerNFSampler._build_nqs()` used wrong parameter name `hidden_dim` instead of `hidden_dims`
- `hi_nqs_sqd.py` passed tensors instead of numpy arrays to `vectorized_dedup`
- Groups 07/08 pipelines discarded NF+DCI basis when calling iterative NQS solvers (Issue #10)
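The `extract_orbital_counts()` fix above — prefer the `mol_info` key, fall back to the integral shapes — can be sketched as below. The real helper lives in `src/qvartools/methods/nqs/_shared.py` and its exact signature differs (the changelog mentions attribute-style `hamiltonian.integrals`); this dict-based version is only an assumption-laden illustration of the fallback logic:

```python
import numpy as np

def extract_orbital_counts(mol_info, hamiltonian=None):
    """Prefer mol_info['n_orbitals']; fall back to integral shapes if absent."""
    n_orb = mol_info.get("n_orbitals")
    if n_orb is None and hamiltonian is not None:
        # One-body integrals are (n_orb, n_orb), so the shape gives the count.
        n_orb = hamiltonian["integrals"]["h1e"].shape[0]
    if n_orb is None:
        raise KeyError("n_orbitals not in mol_info and no integrals to fall back on")
    return n_orb

# get_molecule() does not populate "n_orbitals", so the fallback fires:
ham = {"integrals": {"h1e": np.zeros((4, 4))}}
print(extract_orbital_counts({}, ham))  # 4
```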
17 changes: 12 additions & 5 deletions README.md
@@ -100,18 +100,25 @@ print(f"Energy: {results['final_energy']:.10f} Ha")
### Running experiment pipelines

```bash
# Run a single pipeline on H2
python experiments/pipelines/01_dci/dci_krylov_classical.py h2 --device cuda
# Run a single ablation pipeline on H2
python experiments/pipelines/001_dci/dci_krylov_classical.py h2 --device cuda

# Run all 24 pipelines and compare
# Run a method-as-pipeline benchmark (010+ catalog)
python experiments/pipelines/010_hi_nqs_sqd/default.py h2 --device cuda
python experiments/pipelines/010_hi_nqs_sqd/pt2.py h2 --device cuda

# Run all 33 pipelines and compare
python experiments/pipelines/run_all_pipelines.py h2 --device cuda

# Filter by group prefix (3-digit)
python experiments/pipelines/run_all_pipelines.py h2 --only 001 005 010

# Skip quantum or iterative pipelines for faster validation
python experiments/pipelines/run_all_pipelines.py h2 --skip-quantum --skip-iterative

# Run with a YAML config override
python experiments/pipelines/02_nf_dci/nf_dci_krylov_classical.py lih \
--config experiments/pipelines/configs/02_nf_dci.yaml --max-epochs 200
python experiments/pipelines/002_nf_dci/nf_dci_krylov_classical.py lih \
--config experiments/pipelines/configs/002_nf_dci.yaml --max-epochs 200
```

## Package Architecture
6 changes: 3 additions & 3 deletions docs/decisions/002-eliminate-torch-numpy-roundtrips.md
@@ -282,12 +282,12 @@ ruff format --check src/ tests/ experiments/
# 5. End-to-end pipeline validation (CRITICAL)
cd experiments/pipelines
python run_all_pipelines.py h2 --device cuda
# Expected: 24/24 pipelines pass
# Expected: 33/33 pipelines pass (post-2026-04-07 catalog with 010-013)

# 6. Performance regression check
# Run the specific pipeline before and after, compare wall time
python pipelines/01_dci/dci_sqd.py h2 --device cuda # PR-A
python pipelines/01_dci/dci_krylov_classical.py h2 --device cuda # PR-B
python pipelines/001_dci/dci_sqd.py h2 --device cuda # PR-A
python pipelines/001_dci/dci_krylov_classical.py h2 --device cuda # PR-B
```

---
36 changes: 36 additions & 0 deletions docs/decisions/003-gpu-native-sbd-integration.md
@@ -146,6 +146,42 @@ nvcc -std=c++17 -O3 \
# ARM64: no cross-compilation needed (native on DGX Spark)
```

## Update 2026-04-07: parallel CuPy path + correction on speedup interpretation

**Correction to the speedup table above**: re-reading arXiv:2601.16169 carefully,
AMD-HPC/amd-sbd's 95x is measured on **MI250X (AMD GPU, OpenMP offload)**.
On **GB200** the same library only achieves **2.64x** — a 36x gap between the
two numbers in the same paper. The "95x" cannot be naively claimed for any
NVIDIA target. For DGX Spark GB10 (weaker than GB200 in both bandwidth and
compute), the realistic expectation from amd-sbd is **at most 2.64x, likely less**. This
significantly narrows the speedup advantage of the C++/sbd path over a
well-written native Python (CuPy) implementation on NVIDIA hardware.

A pure-CuPy alternative path (factored-space sigma vector + Davidson driver)
is now tracked in **Issue #38**. Motivations:

1. **No C++ build / MPI strip / sm_121 compilation risk** — pure-Python, runs
wherever CuPy works. Eliminates the entire toolchain risk surface from this
ADR.
2. **Davidson with diagonal preconditioner is non-negotiable for multireference
chemistry** (Cr₂ CAS(12,18) through CAS(12,36), N₂ dissociation, open-shell).
Lanczos has ghost-eigenvalue failure modes for near-degenerate spectra;
PySCF uses Davidson for FCI for this reason. **Whichever backend wins (sbd
or CuPy), proper Davidson must be implemented before Cr₂ work begins.**
This commitment is also recorded in `memory/project_phase2b_davidson_commitment.md`.
3. **Insurance path**: if sbd nanobind Phase 2 hits compilation blockers
(CUDA 13.0 CCCL relocation, MPI strip, sm_121 SASS compatibility), Issue
#38 can carry the long-term work independently.

The sbd path described in this ADR remains active — Issue #38 is **parallel,
not a replacement**. Re-evaluate priorities once both Phase 1 prototypes have
measured numbers on actual DGX Spark GB10 hardware.
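To make point 2 concrete, a minimal Davidson iteration with the diagonal (Jacobi) preconditioner looks like this. This is a didactic NumPy sketch of the algorithm, not the Issue #38 CuPy implementation: production code would use a matrix-free factored-space sigma build, subspace restarts, and multi-root locking.

```python
import numpy as np

def davidson_lowest(A, n_iter=50, tol=1e-8):
    """Lowest eigenpair of a symmetric matrix via Davidson iteration."""
    n = A.shape[0]
    diag = np.diag(A)
    # Start from the unit vector on the smallest diagonal element.
    v = np.zeros(n)
    v[np.argmin(diag)] = 1.0
    V = v[:, None]
    for _ in range(n_iter):
        # Rayleigh-Ritz step in the current subspace.
        AV = A @ V
        theta, s = np.linalg.eigh(V.T @ AV)
        theta0, x = theta[0], V @ s[:, 0]
        r = A @ x - theta0 * x                  # residual vector
        if np.linalg.norm(r) < tol:
            return theta0, x
        # Diagonal preconditioner: the targeted correction that avoids the
        # ghost-eigenvalue failure modes plain Lanczos shows for
        # near-degenerate spectra.
        denom = diag - theta0
        denom[np.abs(denom) < 1e-12] = 1e-12
        t = r / denom
        # Orthogonalize against the subspace and expand it by one vector.
        t -= V @ (V.T @ t)
        t /= np.linalg.norm(t)
        V = np.hstack([V, t[:, None]])
    return theta0, x

rng = np.random.default_rng(0)
M = rng.standard_normal((50, 50))
A = (M + M.T) / 2 + np.diag(np.arange(50.0))    # diagonally dominant test matrix
e0, _ = davidson_lowest(A)
print(abs(e0 - np.linalg.eigvalsh(A)[0]) < 1e-6)  # True
```

Every line here maps onto either backend: the `A @ V` products become the sbd or CuPy sigma build, while the Ritz step and preconditioner stay cheap and backend-agnostic.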

**Cross-references added 2026-04-07**:
- Issue #38: CuPy factored-space Davidson tracking (parallel path)
- `memory/project_phase2b_davidson_commitment.md`: Phase 2b non-negotiable commitment
- `memory/feedback_gpu_sku_extrapolation.md`: lesson on misreading vendor speedup claims

## References

- r-ccs-cms/sbd: https://github.com/r-ccs-cms/sbd