diff --git a/demos/Main_Demo.ipynb b/demos/Main_Demo.ipynb
index 87c01d86d..133f8eb55 100644
--- a/demos/Main_Demo.ipynb
+++ b/demos/Main_Demo.ipynb
@@ -737,15 +737,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "TransformerLens supports 9,000+ models across 50+ architecture families, all of which can be loaded into a consistent(-ish) interface by just changing the name in `boot_transformers`. The available models are [documented here](https://transformerlensorg.github.io/TransformerLens/generated/transformer_bridge_models.html) with some notable ones [documented here](https://dynalist.io/d/n2ZWtnoYHrU1s4vnFSAQ519J#z=jHj79Pj58cgJKdq4t-ygK-4h), and a set of interpretability friendly models I've trained are [documented here](https://dynalist.io/d/n2ZWtnoYHrU1s4vnFSAQ519J#z=NCJ6zH_Okw_mUYAwGnMKsj2m), including a set of toy language models (tiny one to four layer models) and a set of [SoLU models](https://dynalist.io/d/n2ZWtnoYHrU1s4vnFSAQ519J#z=FZ5W6GGcy6OitPEaO733JLqf) up to GPT-2 Medium size (300M parameters)."
-   ]
-  },
-  {
-   "cell_type": "markdown",
-   "metadata": {},
-   "source": [
-    "\n",
-    "Notably, this means that analysis can be near immediately re-run on a different model by just changing the name - to see this, let's load in DistilGPT-2 (a distilled version of GPT-2, with half as many layers) and copy the code from above to see the induction heads in that model."
+    "TransformerLens supports 9,000+ models across 50+ architecture families, all of which can be loaded into a consistent(-ish) interface by just changing the name in `boot_transformers`. The available models are [documented here](https://transformerlensorg.github.io/TransformerLens/generated/transformer_bridge_models.html). Analysis can be near immediately re-run on a different model by just changing the name - to see this, let's load in DistilGPT-2 (a distilled version of GPT-2, with half as many layers) and copy the code from above to see the induction heads in that model."
    ]
   },
   {