
feat(sync): GPU (torch/Metal/CUDA) and numba backends for all 9 metrics#264

Merged
Ramdam17 merged 1 commit into ppsp-team:master from Ramdam17:feat/gpu-numba-backends
Apr 17, 2026

Conversation

@Ramdam17
Collaborator

Summary

Add hardware-accelerated backends to all 9 connectivity metrics in hypyp.sync:

  • numba JIT (prange): PLV, CCorr, Coh, ImCoh, PLI, wPLI, EnvCorr, PowCorr
  • PyTorch (MPS/CUDA/CPU, batched einsum): all 9 metrics
  • Metal compute shaders (Apple Silicon): PLI, wPLI, ACCorr
  • CUDA raw kernels (CuPy, float64): all 9 metrics
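The batched-einsum formulation used by the torch backend can be mirrored in NumPy (the torch version would swap `np.einsum` for `torch.einsum` on a device tensor). This is an illustrative sketch of how PLV reduces to one einsum over unit phasors, not hypyp's actual implementation; the function name and array shapes are assumptions:

```python
import numpy as np

def plv_einsum(phases: np.ndarray) -> np.ndarray:
    """Batched PLV via a single einsum.

    phases: real phase angles, shape (E, F, C, T)
            (epochs, freq bands, channels, time samples).
    Returns a PLV matrix of shape (E, F, C, C).
    """
    z = np.exp(1j * phases)                       # unit phasors e^{i phi}
    # Sum over time of e^{i(phi_c - phi_d)} for every channel pair at once
    cross = np.einsum('efct,efdt->efcd', z, z.conj())
    return np.abs(cross) / phases.shape[-1]

rng = np.random.default_rng(0)
phases = rng.uniform(-np.pi, np.pi, size=(2, 3, 4, 100))
plv = plv_einsum(phases)
assert plv.shape == (2, 3, 4, 4)
# The diagonal is exactly 1: a channel is perfectly phase-locked with itself
assert np.allclose(np.diagonal(plv, axis1=-2, axis2=-1), 1.0)
```

Batching epochs and frequency bands into one einsum is what lets the GPU amortize kernel-launch overhead across the whole (E, F) grid.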

Backend Support Matrix

| Metric  | numpy | numba | torch | metal | cuda_kernel |
|---------|-------|-------|-------|-------|-------------|
| PLV     | x     | x     | x     | --    | x           |
| CCorr   | x     | x     | x     | --    | x           |
| Coh     | x     | x     | x     | --    | x           |
| ImCoh   | x     | x     | x     | --    | x           |
| EnvCorr | x     | x     | x     | --    | x           |
| PowCorr | x     | x     | x     | --    | x           |
| PLI     | x     | x     | x     | x     | x           |
| wPLI    | x     | x     | x     | x     | x           |
| ACCorr  | x     | x     | x     | x     | x           |

Benchmark Highlights

Benchmarked on Mac M4 Max (131 runs) and Narval A100 (111 runs). Key speedups at medium profile (64ch, 20ep, 5 freq bands):

| Metric | torch MPS vs numpy | torch CUDA vs numpy |
|--------|--------------------|---------------------|
| PLV    | 86x                | 138x                |
| Coh    | 133x               | 139x                |
| ImCoh  | 111x               | 157x                |
| ACCorr | 40x                | 191x                |

Full benchmark data: Ramdam17/hypyp-sync-benchmarks

optimization='auto' — Benchmark-Driven Dispatch

The new AUTO_PRIORITY table selects the best GPU backend per metric and platform:

  • MPS: torch for einsum metrics, Metal for sign-based + ACCorr
  • CUDA: cuda_kernel first (pairwise, OOM-safe at 512+ channels), torch fallback
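A priority-table dispatch of this kind can be sketched as follows. The table contents, keys, and helper name below are illustrative stand-ins, not the actual AUTO_PRIORITY values or hypyp internals:

```python
# Illustrative dispatch: pick the first available backend for a metric/platform.
AUTO_PRIORITY = {
    # (platform, metric) -> ordered backend preference (made-up example values)
    ('mps', 'plv'):  ['torch', 'numba', 'numpy'],
    ('mps', 'pli'):  ['metal', 'torch', 'numpy'],
    ('cuda', 'plv'): ['cuda_kernel', 'torch', 'numpy'],
}

def pick_backend(platform: str, metric: str, available: set) -> str:
    """Return the first preferred backend that is installed, else numpy."""
    for backend in AUTO_PRIORITY.get((platform, metric), ['numpy']):
        if backend in available:
            return backend
    return 'numpy'

# cuda_kernel is preferred but not installed here, so torch wins
assert pick_backend('cuda', 'plv', {'torch', 'numpy'}) == 'torch'
```

Keeping the preference list ordered per (platform, metric) pair is what lets the benchmarks, rather than a single global rule, drive the choice.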

CUDA kernels are kept for all metrics as a safety net — torch.cuda OOMs at ≥512 channels due to large intermediate tensors, while custom kernels compute pairwise without materializing the full (E,F,C,C,T) tensor.
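Rough back-of-the-envelope arithmetic (with illustrative sizes) shows why the full cross tensor blows up: a complex128 (E, F, C, C, T) intermediate at 20 epochs, 5 bands, 512 channels, and 1000 samples already needs hundreds of gigabytes, while pairwise kernels only ever hold (E, F, C, C) accumulators:

```python
E, F, C, T = 20, 5, 512, 1000          # illustrative epoch/band/channel/sample counts
bytes_c128 = 16                         # bytes per complex128 value

full = E * F * C * C * T * bytes_c128   # materialized (E, F, C, C, T) tensor
pairwise = E * F * C * C * bytes_c128   # per-pair accumulators only

print(f"full tensor: {full / 1e9:.1f} GB, pairwise accumulators: {pairwise / 1e6:.1f} MB")
```

The ~1000x gap (the length of the time axis) is exactly what the custom kernels save by reducing over time inside the kernel instead of materializing it.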

New priority Parameter

Custom backend ordering per-call:

from hypyp.analyses import compute_sync
con = compute_sync(signal, 'plv', optimization='auto', priority=['torch', 'cuda_kernel'])

New Optional Dependencies

metal = ["pyobjc-framework-Metal>=10.0"]
cupy = ["cupy-cuda12x>=13.0.0"]

Breaking Changes

None beyond what was already introduced in the hypyp.sync module refactor (ACCorr shape change).

Test Plan

  • 83 tests (74 passed, 9 skipped — CUDA tests skip on non-NVIDIA machines)
  • All backends tested: numpy vs numba, numpy vs torch, numpy vs Metal, numpy vs CUDA
  • Tolerances: rtol=1e-9 for CPU/CUDA, rtol=1e-5 for MPS/Metal, rtol=1e-2 for sign-based on MPS
  • Graceful fallback: requesting unavailable backend → numpy with UserWarning
  • Auto-dispatch tests verify correct backend selection per platform
  • CI will run numpy + numba tests (no GPU in GitHub Actions)
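The graceful-fallback behavior above can be sketched with a hypothetical helper (not hypyp's internals): requesting a backend that isn't installed emits a UserWarning and drops to numpy:

```python
import warnings

AVAILABLE = {'numpy', 'numba'}          # illustrative: no GPU stack installed

def resolve_backend(requested: str) -> str:
    """Fall back to numpy with a UserWarning if `requested` is unavailable."""
    if requested in AVAILABLE:
        return requested
    warnings.warn(f"backend '{requested}' unavailable, falling back to numpy",
                  UserWarning)
    return 'numpy'

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    backend = resolve_backend('metal')

assert backend == 'numpy'
assert any(issubclass(w.category, UserWarning) for w in caught)
```

Warning instead of raising keeps existing pipelines running on machines without the optional GPU dependencies.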

🤖 Generated with Claude Code

Add hardware-accelerated backends to all connectivity metrics:
- numba JIT (prange): PLV, CCorr, Coh, ImCoh, PLI, wPLI, EnvCorr, PowCorr
- PyTorch MPS/CUDA/CPU (einsum): all 9 metrics
- Metal compute shaders: PLI, wPLI, ACCorr (Apple Silicon)
- CUDA raw kernels (CuPy): all 9 metrics (NVIDIA GPUs)

Benchmark-driven AUTO_PRIORITY compiled from Mac M4 Max (131 runs) and
Narval A100 (111 runs). The 'auto' optimization selects the best GPU
backend per metric and platform:
- MPS: torch for einsum metrics, Metal for sign-based + ACCorr
- CUDA: cuda_kernel first (OOM-safe at 512ch), torch as fallback

Add `priority` parameter on get_metric() and compute_sync() for custom
backend ordering.

New optional deps: pyobjc-framework-Metal, cupy-cuda12x.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@claude bot left a comment


Claude Code Review

This pull request is from a fork — automated review is disabled. A repository maintainer can comment @claude review to run a one-time review.

@Ramdam17 Ramdam17 merged commit cd0a11d into ppsp-team:master Apr 17, 2026
4 checks passed
