Conversation

@Aatman09 (Contributor) commented Jan 4, 2026

Resolves #107

Reference
This implementation is based on the following tutorial:
JAX Machine Translation Tutorial

Changes made

  • Added dataclass-based configuration for improved clarity and structure
  • Enhanced the tutorial with additional Markdown explanations for better readability

Notes

  • Key–value (KV) caching has been left out

Checklist

  • I have read the Contribution Guidelines and used pre-commit hooks to format this commit.
  • I have added all the necessary unit tests for my change. (run_model.py for model usage, test_outputs.py and/or model_validation_colab.ipynb for quality).
  • (If using an LLM) I have carefully reviewed and removed all superfluous comments or unneeded, commented-out code. Only necessary and functional code remains.
  • I have signed the Contributor License Agreement (CLA).

@chapman20j (Collaborator) commented:
Hi @Aatman09. Thank you for the nice commit. Could you please include a few pip installs at the beginning of the notebook for the additional dependencies, and pin their versions, e.g. `!pip install "grain==0.2.15"`? Also, please ensure that this notebook runs on Colab.
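A minimal sketch of such a setup cell. Only `grain==0.2.15` comes from the review comment; the other package names are illustrative placeholders and should be replaced with the notebook's actual dependencies, pinned to the versions it was tested with (in a Colab cell, each line would be prefixed with `!`):

```shell
# Pin dependency versions so the notebook stays reproducible on Colab.
# grain==0.2.15 is from the review comment; flax/optax pins are hypothetical.
pip install "grain==0.2.15"
pip install "flax" "optax"  # replace with exact tested versions, e.g. "flax==X.Y.Z"
```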

@chapman20j (Collaborator) commented:
For the KV cache, this would be nice to add in the Use Model For Inference section. Caching makes inference faster by allowing attention to reuse the previously computed k and v tensors. This gives you two options:

  1. Implement your own caching logic
  2. Change the flags for the attention layers

I think option 2 makes the most sense for this tutorial, so it doesn't get too far into the weeds on the cache. Implementing your own caching may also require writing your own attention layers. For more details, the nnx docs cover how to initialize a cache (https://flax.readthedocs.io/en/latest/api_reference/flax.nnx/nn/attention.html); this can be done with the `.init_cache` or `.set_mode` methods. Please let me know if you'd like any further clarification or more discussion around this.
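To make the idea concrete, here is a minimal NumPy sketch of what a KV cache does during autoregressive decoding (option 1's logic, stripped of any Flax API): each step stores its key/value vectors and attends over everything cached so far, instead of recomputing k and v for the whole prefix. All names here (`KVCache`, `step`) are hypothetical and for illustration only; the tutorial itself would use the nnx attention flags instead.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

class KVCache:
    """Append-only key/value cache for one attention head (illustrative)."""

    def __init__(self, max_len, d_model):
        self.k = np.zeros((max_len, d_model))
        self.v = np.zeros((max_len, d_model))
        self.idx = 0  # number of positions cached so far

    def step(self, q, k_new, v_new):
        """Cache this step's k/v, then attend the query over the full prefix."""
        self.k[self.idx] = k_new
        self.v[self.idx] = v_new
        self.idx += 1
        k = self.k[: self.idx]                    # (t, d) cached keys
        v = self.v[: self.idx]                    # (t, d) cached values
        scores = k @ q / np.sqrt(q.shape[-1])     # (t,) scaled dot products
        return softmax(scores) @ v                # (d,) attention output
```

Because the cache holds the exact k/v tensors a full forward pass would compute, the final decode step matches causal attention over the whole sequence, but each step only computes one new key/value pair.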

@Aatman09 (Contributor, Author) commented Jan 8, 2026

Thank you for the review. I will implement the changes as soon as possible.



Development

Successfully merging this pull request may close these issues.

Port Encoder-Decoder Example from Jax AI Stack

3 participants