taziksh commented Oct 12, 2025

Description

This PR builds on #816 and resolves all remaining type checking issues, allowing the OLMo implementation to pass mypy validation.

Changes:

  • Added type assertions for OLMo2/OLMoE decoder layers to resolve union-attr errors
  • Bumped minimum Python version to 3.10 for jaxtyping mypy plugin compatibility (Python 3.8 reached EOL Oct 2024)
  • Fixed jaxtyping dimension annotation in create_alibi_multipliers (head_idx → n_heads)
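For readers unfamiliar with the first item, here is a minimal sketch of how a type assertion resolves a mypy `union-attr` error. It is not the code from this PR: `DenseMLP`, `MoEBlock`, `DecoderLayer`, and `count_experts` are hypothetical stand-ins, and the example uses plain Python classes rather than the actual OLMo2/OLMoE decoder layers.

```python
from typing import Union


class DenseMLP:
    """Hypothetical stand-in for a dense feed-forward block."""

    def __init__(self, d_model: int) -> None:
        self.d_model = d_model


class MoEBlock:
    """Hypothetical stand-in for a mixture-of-experts block."""

    def __init__(self, d_model: int, num_experts: int) -> None:
        self.d_model = d_model
        self.num_experts = num_experts


class DecoderLayer:
    """Hypothetical decoder layer whose `mlp` attribute has a union type."""

    def __init__(self, mlp: Union[DenseMLP, MoEBlock]) -> None:
        self.mlp = mlp


def count_experts(layer: DecoderLayer) -> int:
    mlp = layer.mlp
    # Without the assertion below, mypy rejects `mlp.num_experts` with:
    #   Item "DenseMLP" of "Union[DenseMLP, MoEBlock]" has no attribute
    #   "num_experts"  [union-attr]
    # The isinstance check narrows the static type of `mlp` to MoEBlock.
    assert isinstance(mlp, MoEBlock)
    return mlp.num_experts
```

The assertion also runs at runtime, so passing a layer built around `DenseMLP` raises `AssertionError`. Note that `assert` statements are skipped under `python -O`; an explicit `if not isinstance(...): raise TypeError(...)` narrows the type the same way if the check must always run.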

Type of change

  • Bug fix (non-breaking change which fixes an issue)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Checklist:

  • I have commented my code, particularly in hard-to-understand areas
  • My changes generate no new warnings
  • New and existing unit tests pass locally with my changes

taziksh mentioned this pull request Oct 12, 2025

taziksh commented Oct 15, 2025

@jonasrohw this is my PR with the type fixes

taziksh commented Oct 24, 2025

Not sure who to tag to have this PR reviewed! @bryce13950 any pointers would be appreciated

taziksh commented Oct 31, 2025

Bumping this again for visibility. OLMo 2’s open-source training data makes it a great fit for TransformerLens and could enable some interesting experiments. Would appreciate a review when possible.
