Can the geometry of thought reveal how alignment works?
I develop differential-geometric frameworks for understanding how large language models encode, transform, and ultimately suppress beliefs, with a focus on mechanistic interpretability and AI alignment. The Torsional Belief Vector Field (TBVF) models transformer hidden states as discrete curves on a Riemannian belief manifold equipped with a Cartan torsion connection. This reveals, for the first time, where and how DPO/RLHF alignment geometrically reshapes model internals, creating what I call "brake layers": localized, geometrically distinct suppression mechanisms.
I am actively seeking fully funded PhD positions in mechanistic interpretability, geometric deep learning, and AI alignment at world-class research universities.
The Torsional Belief Vector Field treats each transformer layer's hidden state as a point on a high-dimensional Riemannian manifold equipped with the Fisher-Rao metric. The torsion tensor, the antisymmetric component of the cross-layer covariance, measures the rotational mismatch between consecutive belief updates.
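As a minimal sketch of how such a quantity might be estimated, the snippet below takes a stack of per-layer hidden states, forms the residual-stream updates between consecutive layers, and computes the antisymmetric part of their sample cross-covariance; its Frobenius norm gives a scalar torsion magnitude per layer. The function name `torsion_profile`, the token-wise centering, and the Frobenius-norm summary are illustrative assumptions, not the TBVF implementation itself.

```python
import numpy as np

def torsion_profile(hidden_states: np.ndarray) -> np.ndarray:
    """Per-layer torsion magnitudes from stacked hidden states.

    hidden_states: (num_layers, num_tokens, d_model), e.g. the
    per-layer activations a transformer returns when asked for
    all hidden states.
    """
    # Belief updates: residual-stream deltas between consecutive layers.
    updates = np.diff(hidden_states, axis=0)

    magnitudes = []
    for l in range(updates.shape[0] - 1):
        # Center each update across tokens before taking covariance.
        u = updates[l] - updates[l].mean(axis=0, keepdims=True)
        v = updates[l + 1] - updates[l + 1].mean(axis=0, keepdims=True)
        # Sample cross-covariance between consecutive belief updates.
        cross_cov = u.T @ v / (u.shape[0] - 1)
        # Torsion: the antisymmetric component of that covariance.
        torsion = 0.5 * (cross_cov - cross_cov.T)
        # Frobenius norm as a scalar rotational-mismatch score.
        magnitudes.append(np.linalg.norm(torsion))
    return np.array(magnitudes)
```

Under the same assumptions, one way to surface candidate brake layers would be to compare this profile between a base checkpoint and its DPO/RLHF-aligned counterpart; the three-layer cutoff below is an arbitrary illustration, not a claimed detection criterion.

```python
# Hypothetical usage: layers whose torsion jumps most after alignment
# are candidate "brake layers".
delta = torsion_profile(aligned_states) - torsion_profile(base_states)
brake_candidates = np.argsort(delta)[-3:]
```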

