30 commits
7c1d141
updated loading in exploratory analysis demo to use transformer bridge
degenfabian Aug 18, 2025
c0e09ab
Merge remote-tracking branch 'origin/dev-3.x' into exploratory_analys…
bryce13950 Aug 20, 2025
6a7a104
Merge remote-tracking branch 'origin/dev-3.x' into exploratory_analys…
bryce13950 Aug 22, 2025
5fdaa42
Merge remote-tracking branch 'origin/dev-3.x' into exploratory_analys…
bryce13950 Aug 26, 2025
a26bc54
Merge remote-tracking branch 'origin/dev-3.x' into exploratory_analys…
bryce13950 Sep 4, 2025
e75a895
Merge remote-tracking branch 'origin/dev-3.x' into exploratory_analys…
bryce13950 Sep 5, 2025
00e5cb5
Merge remote-tracking branch 'origin/dev-3.x' into exploratory_analys…
bryce13950 Sep 6, 2025
f3b2d92
Merge remote-tracking branch 'origin/dev-3.x' into exploratory_analys…
bryce13950 Sep 7, 2025
875c7a8
Merge remote-tracking branch 'origin/dev-3.x' into exploratory_analys…
bryce13950 Sep 10, 2025
6a2cf40
Merge remote-tracking branch 'origin/dev-3.x' into exploratory_analys…
bryce13950 Sep 10, 2025
54d0d4f
Merge remote-tracking branch 'origin/dev-3.x' into exploratory_analys…
bryce13950 Sep 12, 2025
8881cb9
Merge remote-tracking branch 'origin/dev-3.x' into exploratory_analys…
bryce13950 Sep 12, 2025
baefa04
Merge remote-tracking branch 'origin/dev-3.x' into exploratory_analys…
bryce13950 Sep 12, 2025
cea930c
Merge remote-tracking branch 'origin/dev-3.x-folding' into explorator…
bryce13950 Oct 10, 2025
8f413c9
Merge remote-tracking branch 'origin/dev-3.x-folding' into explorator…
bryce13950 Oct 13, 2025
8685465
Merge remote-tracking branch 'origin/dev-3.x-folding' into explorator…
bryce13950 Oct 14, 2025
5c9a685
Merge remote-tracking branch 'origin/dev-3.x-folding' into explorator…
bryce13950 Oct 14, 2025
270bae7
Merge remote-tracking branch 'origin/dev-3.x-folding' into explorator…
bryce13950 Oct 15, 2025
9148919
Merge remote-tracking branch 'origin/dev-3.x-folding' into explorator…
bryce13950 Oct 15, 2025
b82ac59
Merge remote-tracking branch 'origin/dev-3.x-folding' into explorator…
bryce13950 Oct 15, 2025
f17d687
Merge remote-tracking branch 'origin/dev-3.x-folding' into explorator…
bryce13950 Oct 16, 2025
9e8ac56
Merge remote-tracking branch 'origin/dev-3.x-folding' into explorator…
bryce13950 Oct 16, 2025
dd14f50
Merge remote-tracking branch 'origin/dev-3.x-folding' into explorator…
bryce13950 Oct 16, 2025
219718a
Merge remote-tracking branch 'origin/dev-3.x-folding' into explorator…
bryce13950 Oct 16, 2025
db623eb
Merge remote-tracking branch 'origin/dev-3.x-folding' into explorator…
bryce13950 Oct 16, 2025
9a45cd6
Merge remote-tracking branch 'origin/dev-3.x-folding' into explorator…
bryce13950 Oct 17, 2025
223c382
Merge remote-tracking branch 'origin/dev-3.x-folding' into explorator…
bryce13950 Oct 23, 2025
458f7a4
Merge remote-tracking branch 'origin/dev-3.x-folding' into explorator…
bryce13950 Nov 12, 2025
f75368c
Merge remote-tracking branch 'origin/dev-3.x-folding' into explorator…
bryce13950 Nov 12, 2025
c58a07c
Merge remote-tracking branch 'origin/dev-3.x-folding' into explorator…
bryce13950 Nov 12, 2025
demos/Exploratory_Analysis_Demo.ipynb: 16 changes (9 additions, 7 deletions)
@@ -100,7 +100,7 @@
},
{
"cell_type": "code",
"execution_count": 2,
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
@@ -118,7 +118,8 @@
"from jaxtyping import Float\n",
"\n",
"import transformer_lens.utils as utils\n",
"from transformer_lens import ActivationCache, HookedTransformer"
"from transformer_lens import ActivationCache\n",
"from transformer_lens.model_bridge import TransformerBridge"
]
},
{
@@ -245,12 +246,12 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"The first step is to load in our model, GPT-2 Small, a 12 layer and 80M parameter transformer with `HookedTransformer.from_pretrained`. The various flags are simplifications that preserve the model's output but simplify its internals."
"The first step is to load in our model, GPT-2 Small, a 12 layer and 80M parameter transformer with `TransformerBridge.boot_transformers`. The various flags are simplifications that preserve the model's output but simplify its internals."
]
},
{
"cell_type": "code",
"execution_count": 5,
"execution_count": null,
"metadata": {},
"outputs": [
{
@@ -270,13 +271,14 @@
],
"source": [
"# NBVAL_IGNORE_OUTPUT\n",
"model = HookedTransformer.from_pretrained(\n",
" \"gpt2-small\",\n",
"model = TransformerBridge.boot_transformers(\n",
" \"gpt2\",\n",
" center_unembed=True,\n",
" center_writing_weights=True,\n",
" fold_ln=True,\n",
" refactor_factored_attn_matrices=True,\n",
")\n",
"model.enable_compatibility_mode()\n",
"\n",
"# Get the default device used\n",
"device: torch.device = utils.get_device()"
@@ -372,7 +374,7 @@
"\n",
"We want models that can take in arbitrary text, but models need to have a fixed vocabulary. So the solution is to define a vocabulary of **tokens** and to deterministically break up arbitrary text into tokens. Tokens are, essentially, subwords, and are determined by finding the most frequent substrings - this means that tokens vary a lot in length and frequency! \n",
"\n",
"Tokens are a *massive* headache and are one of the most annoying things about reverse engineering language models... Different names will be different numbers of tokens, different prompts will have the relevant tokens at different positions, different prompts will have different total numbers of tokens, etc. Language models often devote significant amounts of parameters in early layers to convert inputs from tokens to a more sensible internal format (and do the reverse in later layers). You really, really want to avoid needing to think about tokenization wherever possible when doing exploratory analysis (though, of course, it's relevant later when trying to flesh out your analysis and make it rigorous!). HookedTransformer comes with several helper methods to deal with tokens: `to_tokens, to_string, to_str_tokens, to_single_token, get_token_position`\n",
"Tokens are a *massive* headache and are one of the most annoying things about reverse engineering language models... Different names will be different numbers of tokens, different prompts will have the relevant tokens at different positions, different prompts will have different total numbers of tokens, etc. Language models often devote significant amounts of parameters in early layers to convert inputs from tokens to a more sensible internal format (and do the reverse in later layers). You really, really want to avoid needing to think about tokenization wherever possible when doing exploratory analysis (though, of course, it's relevant later when trying to flesh out your analysis and make it rigorous!). TransformerBridge comes with several helper methods to deal with tokens: `to_tokens, to_string, to_str_tokens, to_single_token, get_token_position`\n",
"\n",
"**Exercise:** I recommend using `model.to_str_tokens` to explore how the model tokenizes different strings. In particular, try adding or removing spaces at the start, or changing capitalization - these change tokenization!</details>"
]
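Read together, the changed cells amount to the following standalone loading snippet. This is a sketch for orientation only, assuming the TransformerBridge API behaves as the diff shows (boot_transformers accepts the same simplification flags, and enable_compatibility_mode is called once after booting); it is not itself part of the PR.

import torch

import transformer_lens.utils as utils
from transformer_lens.model_bridge import TransformerBridge

# Boot GPT-2 small through the bridge, with the same simplification flags the
# notebook previously passed to HookedTransformer.from_pretrained.
model = TransformerBridge.boot_transformers(
    "gpt2",
    center_unembed=True,
    center_writing_weights=True,
    fold_ln=True,
    refactor_factored_attn_matrices=True,
)
model.enable_compatibility_mode()  # new step added by this PR's loading cell

# Get the default device used
device: torch.device = utils.get_device()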
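The tokenization cell points readers at the model's token helpers (to_tokens, to_string, to_str_tokens, to_single_token, get_token_position). Below is a small illustration of the suggested exercise, assuming `model` was loaded as in the snippet above and exposes these helpers as the notebook text states.

# Compare how leading spaces and capitalization change GPT-2's tokenization.
print(model.to_str_tokens("Mary"))
print(model.to_str_tokens(" Mary"))   # a leading space often changes the split
print(model.to_str_tokens(" mary"))   # so can capitalization

# to_single_token expects the string to map to exactly one token, which makes
# it a convenient sanity check when constructing prompts.
print(model.to_single_token(" Mary"))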