Fix benchmark hooks cleanup by danra · Pull Request #1425 · TransformerLensOrg/TransformerLens

danra · 2026-06-21T00:48:16Z

Description

HookPoint.add_hook returns None, but the benchmark helpers stored its return value as a "handle" (silencing mypy behind an `if handle is not None` guard). Since the value was always None, the guard was always false and the capture hooks were never removed.

Track the HookPoint that each hook was registered on and clean up via hook_point.remove_hooks() (dir="bwd" for backward hooks), matching the existing idiom in TransformerBridge.run_with_cache.
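To illustrate the failure mode and the cleanup pattern described above, here is a minimal sketch. `ToyHookPoint`, `capture`, `run_old_helper` and `run_new_helper` are hypothetical names invented for this example; the toy class only mimics the two behaviours the description relies on (`add_hook` returns None, `remove_hooks` takes a `dir`), and it is not the library's HookPoint or the actual benchmark code.

```python
from typing import Callable, List, Tuple


class ToyHookPoint:
    """Hypothetical stand-in for HookPoint; not the library class."""

    def __init__(self) -> None:
        self.fwd_hooks: List[Callable] = []
        self.bwd_hooks: List[Callable] = []

    def add_hook(self, hook: Callable, dir: str = "fwd") -> None:
        # As in the behaviour described above, nothing is returned.
        target = self.fwd_hooks if dir == "fwd" else self.bwd_hooks
        target.append(hook)

    def remove_hooks(self, dir: str = "fwd") -> None:
        if dir == "fwd":
            self.fwd_hooks.clear()
        else:
            self.bwd_hooks.clear()


def capture(value, hook=None):
    """Placeholder capture hook; returns its input unchanged."""
    return value


def run_old_helper(point: ToyHookPoint) -> int:
    """Old pattern: treat the return value of add_hook as a handle."""
    handle = point.add_hook(capture)  # always None
    if handle is not None:  # never true, so cleanup is skipped
        handle.remove()
    return len(point.fwd_hooks)  # hooks still attached


def run_new_helper(fwd_point: ToyHookPoint, bwd_point: ToyHookPoint) -> int:
    """New pattern: remember each hook point and its direction."""
    registered: List[Tuple[ToyHookPoint, str]] = []
    fwd_point.add_hook(capture, dir="fwd")
    registered.append((fwd_point, "fwd"))
    bwd_point.add_hook(capture, dir="bwd")
    registered.append((bwd_point, "bwd"))
    for point, direction in registered:
        point.remove_hooks(dir=direction)
    return len(fwd_point.fwd_hooks) + len(bwd_point.bwd_hooks)


leaked = run_old_helper(ToyHookPoint())
remaining = run_new_helper(ToyHookPoint(), ToyHookPoint())
print(leaked, remaining)
```

In the sketch, the old helper leaves its hook attached while the new one leaves none behind. Note that clearing by direction removes every hook of that direction on the point, which is acceptable only if the helper owns all hooks on the points it touches.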

Type of change

Bug fix (non-breaking change which fixes an issue)

Checklist:

My changes generate no new warnings
New and existing unit tests pass locally with my changes
I have not rewritten tests relating to key interfaces which would affect backward compatibility

…ensOrg#1316)

* Add Direct Logit Attribution tool for TransformerBridge
* Resolve review feedback and add Direct Logit Attribution tests

Resolved review feedback from @jlarson4, added tests covering reconstruction invariants on a distilgpt2 bridge in compatibility mode, arguments, asserting sum(scores) == logit_diff - (b_U[correct] - b_U[wrong]) against the model's real logits, plus labels/shape and batch-averaging checks.

Added additional hardening:
- Fix a latent direction-shape bug: replace the fragile answer_tokens.numel()==1 branch with a robust reshape so single-prompt, single-token inputs are handled correctly
- Detect hybrid blocks via bridge.layer_types() instead of substring matching named_modules(), the codebase's own semantic mechanism
- Import get_act_name from transformer_lens.utilities to avoid the transformer_lens.utils DeprecationWarning; drop the invalid return_type kwarg to run_with_cache
- Register the analysis subpackage in tools/__init__.py

Closes TransformerLensOrg#1263.

…merLensOrg#1369)

* Add Direct Logit Attribution tool (TransformerLensOrg#1263)

Add transformer_lens/tools/analysis/direct_logit_attribution.py, a single-call DLA analysis that decomposes a logit (or logit difference) into per-component, per-layer (logit-lens), or per-head contributions. Wraps the existing ActivationCache primitives (decompose_resid / accumulated_resid / stack_head_results / logit_attrs) and works with both HookedTransformer and TransformerBridge, since they share the cache API. Returns a DirectLogitAttribution dataclass (attribution tensor + aligned labels, plus a top(k) helper). Adds integration tests asserting the exact DLA correctness invariant on both systems: the complete decomposition reconstructs the model's real logit up to the unembedding bias b_U. Closes TransformerLensOrg#1263

* Resolving conflicts between 1316 and 1369
* format fixes

---------

Co-authored-by: Azra Bano <azrabano23@gmail.com>
Co-authored-by: Jonah Larson <jonahalarson@comcast.net>

The Restricted Loss section called loss_fn(all_logits, labels), but all_logits had been rearranged earlier into a (p, p, d_vocab) grid for the logit periodicity analysis. loss_fn's 3-D branch assumes (batch, pos, d_vocab) and takes logits[:, -1], producing a (p, p) tensor that crashes the gather against the p*p labels (TransformerLensOrg#543). Use original_logits instead, which is recomputed just above and is the same full-dataset loss the cell intends to print. Also clear the stored RuntimeError output from the cell.

Breaking: removes the public eps_attr constructor argument and the config.eps_attr attribute. The field was never read (its consumer was deleted when NormalizationBridge moved to direct HF delegation), so no model behavior changes, but it is an API removal.

…ransformerLensOrg#1390)

The adapter already conditionally omitted ln2 from the block submodules when use_parallel_residual=True, but still wrapped them in a plain BlockBridge, which rejects the attn+mlp+no-ln2 shape. Switched to conditional block_cls (ParallelBlockBridge for the parallel branch, BlockBridge for sequential), mirroring the dual-mode pattern in falcon.py.

26 tests covering: component mapping (slots, bridge types, HF paths, submodule structure), anti-drift config flags (final_rms, uses_rms_norm, gated_mlp), weight conversion key set and rearrange patterns, GQA propagation to K/V only, and setup_component_testing rotary-emb wiring.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix: Fix broken line graphs
  - Fixes incorrect induction loss graph at end of notebook (values were incorrect, loss appeared to be going up with training instead of down!)
  - Fixes x/y axes showing "index"/"value" instead of configured names
* fix: broken/outdated links
* doc: Reference ARENA chapter directly instead of an older equivalent that forwards there
* chore: Align text in code with text in markdown
* doc: Update texts on supported models and architectures
  - Updated counts
  - Updated link to list of all models supported on v3
  - "HookedTransformer.from_pretrained" instances -> "TransformerBridge.boot_transformers", the up-to-date recommended method which makes the wider variety accessible
  - "consistent(-ish) architecture" -> "consistent(-ish) interface": with v3, the consistent interface proxies the non-consistent underlying architectures
  - "Transformer architecture" title -> "HookedTransformer architecture", mention deprecation
  - Mention overview of models last updated in 2023
  - Drop stale reference to phantom "table for hyper-parameters"
* doc: Add note on boot_transformers incompatibility with checkpoints
* doc: Document purpose of enabling compatibility mode
* doc: Drop minor comment on default_prepend_bos, only relevant for legacy HookedTransformer
* doc: Remove stale gotcha, hooks are now properly removed even in case of an error in a hook
* fix: Mismatching values in code vs. descriptions
* chore: Replace deprecated circuitvis attention visualization with newer one
* chore: Replace deprecated model names aliases
* chore: Remove deprecated prepend_bos argument
* chore: Update hook name to v3
* chore: Full OV graph title no longer the same as the preceding very similar OV graph
* doc: Clearer spaces in tokens
* doc: Bracket tokens styled as code to prevent Colab from collapsing spaces
  Previously Colab showed both tokens as identical, squashing the space
* chore: Try the model out right after calls to do so
* doc: approximate number can be approximate (consistent with another preceding one)
* chore: Remove installing old node version which was the newest at the time
  Now actually gives a deprecation warning and waits for 10 seconds
* chore: Remove leftover cell
* chore: fix bad closing tag
* doc: fix typos
* doc: More precise wording, setting a value, not adding it
* doc: move floating function names into complete sentence

25 tests across 4 classes covering component mapping, config flags, weight conversions, and GQA head-count propagation.
- TestMistralComponentMapping (12 tests): top-level keys, bridge types, HF module paths, block submodules, attn flags, QKVO paths, MLP paths. Includes explicit guard that attn uses AttentionBridge, not PositionEmbeddingsAttentionBridge.
- TestMistralAdapterConfig (4 tests): final_rms=False, uses_rms_norm, gated_mlp, attn_only (anti-drift flags).
- TestMistralWeightConversions (5 tests): exactly 4 QKVO weight keys, split-heads and merge-heads rearrange patterns, no bias/norm entries.
- TestMistralGQASupport (4 tests): K/V use n_key_value_heads, Q/O unchanged, fallback to n_heads when n_key_value_heads is unset.

…ansformerLensOrg#1399)

Adds focused test suites for three architecture adapters per the proposal in issue TransformerLensOrg#1302.

tests/unit/model_bridge/supported_architectures/test_phi3_adapter.py
- Component mapping structure (bridge types and HF module paths)
- Weight conversion key set and source keys for fused qkv/gate_up
- _SizedSplitConversion numerical correctness (Q/K/V GQA splits)
- Config flags (RMS norm, rotary, gated MLP, supports_fold_ln=False)
- preprocess_weights LN folding into QKV and gate/up projections

tests/unit/model_bridge/supported_architectures/test_granite_adapter.py
- Component mapping for dense GraniteArchitectureAdapter
- Weight conversion key set (standard QKVO rearrangements)
- Config flags (final_rms=True, default_prepend_bos=False, GQA heads)
- GraniteMoeArchitectureAdapter: MoE bridge replaces dense MLP, all other components and config flags match dense Granite

Closes part of TransformerLensOrg#1302.

… handle

HookPoint.add_hook returns None, but the benchmark helpers stored its return value as a "handle" (silencing mypy behind an `if handle is not None` guard). Since the value was always None, the guard was always false and the capture hooks were never removed.

Track the HookPoint that each hook was registered on and clean up via hook_point.remove_hooks() (dir="bwd" for backward hooks), matching the existing idiom in TransformerBridge.run_with_cache.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

danra and others added 28 commits June 8, 2026 09:14

Merge pull request TransformerLensOrg#1370 from danra/patch-1

9deb6bf

Fix broken link in README

Merge remote-tracking branch 'origin/main' into dev

75095b1

Add stop_strings and stopping_criteria support to TransformerBridge.generate (TransformerLensOrg#1374)

a5f1193

Remove extra checks from Phi adapter setup_component_testing (TransformerLensOrg#1373)

d37642d

Add phi tests (TransformerLensOrg#1372)

35ab438

* Add Phi adapter tests
* Add comment about setup component test
* Delete redundant config literal tests

Fixed SVD interpreter test (TransformerLensOrg#1375)

34dc38a

* Fixed SVD interpreter test
* Format SVD interpreter fixture test

Fix typos and narrow a bare except (TransformerLensOrg#1380)

036a861

Add unit tests for NeoArchitectureAdapter (TransformerLensOrg#1381)

8c395ee

Add unit tests for NeoxArchitectureAdapter (TransformerLensOrg#1382)

d6896df

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

Add Olmo2 architecture adapter tests (TransformerLensOrg#1387)

e49f78c

* Add Olmo2 architecture adapter tests
* Drop test_attn_output_shape per the unit-test guide (shared bridge contract)

Add Qwen Adapter unit tests (TransformerLensOrg#1388)

d603a1d

* Fixed SVD interpreter test
* Format SVD interpreter fixture test
* Add qwen adapter unit test
* Retrigger CI (unrelated HF 429 error failing the build)

Add unit tests for OpenElmArchitectureAdapter (TransformerLensOrg#1383)

5962cd2

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>

Add unit tests for LlavaOnevisionArchitectureAdapter (TransformerLensOrg#1384)

b682377

Add StableLM architecture adapter tests (TransformerLensOrg#1393)

09f2eba

Remove torch cap, so that newer versions of python can still resolve properly (TransformerLensOrg#1394)

56c3d91

Updating Agentic Workflows (TransformerLensOrg#1395)

6a17449

Drop round-trip and output-shape tests per the unit-test guide (TransformerLensOrg#1397)

8691fa3

Direct path patch demo (TransformerLensOrg#1398)

cae9d46

* chore: Plot helper allows customizing graph before showing it
* feat: Direct path patching in exploratory analysis demo, resolves TransformerLensOrg#111
* doc: fix head index in prose

danra closed this Jun 21, 2026

danra wants to merge 28 commits into TransformerLensOrg:main from danra:fix_benchmark_hooks_cleanup
