
Conversation

@chapman20j
Collaborator

Resolves #100
Adds the Gemma3 model to the Bonsai repo. This first commit is a working version; I am still optimizing it.

Reference
Refer to Issue #100

Checklist

  • I have read the Contribution Guidelines and used pre-commit hooks to format this commit.
  • I have added all the necessary unit tests for my change. (run_model.py for model usage, test_outputs.py and/or model_validation_colab.ipynb for quality).
  • (If using an LLM) I have carefully reviewed and removed all superfluous comments or unneeded, commented-out code. Only necessary and functional code remains.
  • I have signed the Contributor License Agreement (CLA).

  • Updated configs
  • Moved embed_tokens to a more natural place
  • Updated run_model to use the sampler and stop at the end_of_turn token (see the sketch below)
  • Added test_sharding_gemma3
  • Added a batched forward test; more complex behavior and testing are still needed
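
For reference, a minimal sketch of that decode loop, stopping at the end_of_turn token. This uses greedy argmax for illustration; the actual sampler may differ, and model_apply, params, and end_of_turn_id are placeholder names, not the real run_model.py API:

    import jax.numpy as jnp

    def generate(model_apply, params, prompt_ids, end_of_turn_id, max_steps=128):
        """Greedy decode: append argmax tokens until end_of_turn_id or max_steps."""
        ids = list(prompt_ids)
        for _ in range(max_steps):
            logits = model_apply(params, jnp.asarray([ids]))  # [1, len(ids), vocab]
            next_id = int(jnp.argmax(logits[0, -1]))
            ids.append(next_id)
            if next_id == end_of_turn_id:  # stop at <end_of_turn>
                break
        return ids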
@jenriver
Member

Also, please make sure your selective tests are passing.

Comment on lines 538 to 544
def init_cache(
    cfg: ModelConfig, batch_size: int, token_len: int, generate_steps: int, dtype: jnp.dtype = jnp.bfloat16
) -> Cache:
    cache_size = 2 ** math.ceil(math.log2(max(token_len + generate_steps, 1)))  # Pad for a sharding-friendly size.
    return [
        LayerCache(cfg.text_config, batch_size, cache_size, dtype) for _ in range(cfg.text_config.num_hidden_layers)
    ]
Member

Currently this is a global KV cache implementation, so it doesn't realize the KV cache memory reduction that Gemma 3's sliding-window (local) attention layers allow.

Could we have something like this to account for local vs. global?

def init_cache(...) -> Cache:
    full_cache_size = 2 ** math.ceil(math.log2(max(token_len + generate_steps, 1)))
    window_size = cfg.text_config.sliding_window  # Typically 1024

    caches = []
    for layer_type in cfg.text_config.layer_types:
        size = full_cache_size if layer_type == AttentionMode.FULL else window_size
        caches.append(LayerCache(cfg.text_config, batch_size, size, dtype))
    return caches

We can also update the K/V cache write logic to index modulo window_size for the sliding-window layers.
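
A minimal sketch of what that modulo (ring-buffer) write could look like for a sliding-window layer; the cache layout and function name here are assumptions for illustration, not the existing Bonsai API:

    import jax
    import jax.numpy as jnp

    def update_window_cache(k_cache, v_cache, k_new, v_new, write_pos, window_size):
        """Write one decode step's K/V at slot write_pos % window_size.

        k_cache, v_cache: [batch, window_size, num_kv_heads, head_dim]
        k_new, v_new:     [batch, 1, num_kv_heads, head_dim]
        write_pos:        absolute token position (scalar int array)
        """
        slot = jnp.mod(write_pos, window_size)  # ring-buffer slot; wraps once past the window
        k_cache = jax.lax.dynamic_update_slice_in_dim(k_cache, k_new, slot, axis=1)
        v_cache = jax.lax.dynamic_update_slice_in_dim(v_cache, v_new, slot, axis=1)
        return k_cache, v_cache

The attention mask for these layers would then need to account for the wrapped ordering; full-attention layers keep the existing linear write (equivalently, the same code with window_size set to the full cache size).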

@jenriver previously approved these changes Jan 8, 2026
@jenriver merged commit 36896a5 into jax-ml:main Jan 8, 2026
4 of 5 checks passed