Direct path patch demo #1398

Merged: jlarson4 merged 3 commits into TransformerLensOrg:dev from danra:direct_path_patch_demo on Jun 19, 2026
danra (Contributor) commented Jun 18, 2026

Description

Adds a direct path patching demonstration to the Exploratory Analysis notebook, resolving #111.

(As it happens, another PR, #1396, had just been opened to add direct path patching to transformer_lens; apparently I was not the only one working on this! That PR adds a new demo rather than amending the existing Exploratory Analysis one, and also adds the feature to transformer_lens itself. I think my approach of amending the existing notebook demonstrates the technique better.)

Type of change

  • New feature (non-breaking change which adds functionality)

Checklist:

  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • New and existing unit tests pass locally with my changes
  • I have not rewritten tests relating to key interfaces which would affect backward compatibility

jlarson4 (Collaborator) commented Jun 18, 2026

Hi @danra! Thanks for putting this together. I want to keep both your demo as part of the Exploratory Analysis demo and @mukund1985's specific methods and unit tests. There are things both of you have done well that I want to make sure are included in TransformerLens.

My intention is to merge #1396 first, and have you rebase onto the latest version of dev at that point. This way the methods established in #1396 can be used to run your section in Exploratory Analysis. You can then point people to the Direct Path Patching-specific demo at the end of the section as a pointer to additional examples.

danra (Contributor, Author) commented Jun 18, 2026

@jlarson4 That wouldn't match the existing structure of the demos: Exploratory Analysis demonstrates the activation patching technique conceptually, showing how to do it from scratch. It makes a point of not using the existing utilities, and instead refers readers to the Activation Patching demo for that purpose:

This section explains how to do activation patching conceptually by implementing it from scratch. To use it in practice with TransformerLens, see this demonstration instead

Consistent with that approach, I would keep Exploratory Analysis demonstrating direct path patching from first principles, and add a similar reference to a demo that uses the library's facilities. That could be a new section in the existing notebook (which could change its name from "Activation Patching" to "Patching Models", or some other, wider-scope name) or a new notebook (which #1396 already has).

jlarson4 (Collaborator) commented

You are absolutely correct, @danra. I was looking at your edits in isolation and not thinking about what Exploratory Analysis actually is. I am rerunning the CI tests that failed due to 492s; assuming those pass, I will get this merged. Thank you for your feedback, I appreciate it!

jlarson4 merged commit cae9d46 into TransformerLensOrg:dev on Jun 19, 2026
48 of 50 checks passed
danra deleted the direct_path_patch_demo branch on June 19, 2026 05:52