Gemma 4 E2B emits parenthetical "word(suffix)" continuations under lower-temperature sampling

## Observation

Gemma 4 E2B (instruction-tuned) appears to have a learned **"word(suffix)" parenthetical continuation style** — the model frequently emits constructions like `offline(text)` / `download(option)` / `feature(name)`, where a noun is immediately followed by a parenthesized clarifier word/phrase, in contexts where this format is not stylistically warranted.

Counter-intuitively, the pattern is **more frequent at lower temperatures**, suggesting it's a high-probability mode of the conditional distribution rather than a noise artifact. At Google's recommended `T=1.0, topP=0.95, topK=64`, the pattern is occasional. At `T=0.4`, it dominates body-text generation in our eval.
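The "high-probability mode" reading can be illustrated with a toy example. This is a minimal sketch, not a measurement: the logits below are made up, with index 0 standing in for the `(` continuation, and top-k / top-p filtering is ignored. It only shows that dividing logits by a temperature below 1 moves probability mass toward whichever token already leads.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Return softmax(logits / temperature) as a list of probabilities."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-token logits (assumed values, not taken from the model).
# Index 0 plays the role of the "(" continuation.
logits = [2.0, 1.5, 1.0, 0.5]

p_t10 = softmax_with_temperature(logits, 1.0)
p_t04 = softmax_with_temperature(logits, 0.4)

print(f"P(top token) at T=1.0: {p_t10[0]:.3f}")
print(f"P(top token) at T=0.4: {p_t04[0]:.3f}")
```

If the parenthetical continuation is already the leading candidate after a noun, the same mechanism would make it more frequent at `T=0.4` than at `T=1.0`, which matches what we see.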

## Why this matters

Lower temperature is usually expected to make output "tighter, more deterministic, more on-task." For Gemma 4 E2B, lower temperature instead **amplifies** a specific learned-style attractor. This blocks one of the standard mitigations for a separate Gemma 4 decoding pathology, token repetition / corruption in tool arguments, that we and others (google-deepmind/gemma#622, vllm-project/vllm#40080) have documented.

Concretely: we'd lower temperature for tool-arg segments to reduce the `mayaya`/`dzogogchen`/`202222`-family corruption. But lowering it surfaces the parenthetical pattern in body text. We're stuck at T=1.0.
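For reference, this is roughly the per-segment behavior we would want. It is a sketch under stated assumptions: `SamplerConfig`, `SegmentSampler`, and the `<tool_args>` / `</tool_args>` markers are hypothetical names invented for illustration, not Gemma's tool-call format and not an existing LiteRT-LM or vLLM API.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SamplerConfig:
    temperature: float
    top_p: float = 0.95
    top_k: int = 64

BODY = SamplerConfig(temperature=1.0)       # recommended profile for prose
TOOL_ARGS = SamplerConfig(temperature=0.4)  # tighter profile for tool arguments

# Hypothetical segment markers (assumption, not the real tool-call syntax).
ARGS_OPEN = "<tool_args>"
ARGS_CLOSE = "</tool_args>"

class SegmentSampler:
    """Tracks whether decoding is inside a tool-argument segment."""

    def __init__(self):
        self.in_tool_args = False

    def observe(self, token):
        # Update state after a token has been emitted.
        if token == ARGS_OPEN:
            self.in_tool_args = True
        elif token == ARGS_CLOSE:
            self.in_tool_args = False

    def current(self):
        # Config to use when sampling the next token.
        return TOOL_ARGS if self.in_tool_args else BODY

# Replay a token stream and record the temperature each token would be sampled at.
tokens = ["The", "answer", ARGS_OPEN, "2022", ARGS_CLOSE, "done"]
sampler = SegmentSampler()
temps = []
for tok in tokens:
    temps.append(sampler.current().temperature)
    sampler.observe(tok)
print(list(zip(tokens, temps)))
```

Without something like this in the runtime, a single temperature applies to the whole response, which is why the two problems trade off against each other.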

## Reproduction

- Model: Gemma 4 E2B-it
- Runtime: LiteRT-LM 0.10.2 (also reproducible in vLLM per separate testing)
- Device: Pixel 10a, Mali-G715 GPU
- Sampler: `topK=64, topP=0.95` with `T=0.4` (degraded: the pattern dominates body text) or `T=1.0` (acceptable, but the pattern still appears occasionally)
- Prompt class: general chat / explanatory turns. Most reproducible when the user asks "what is X" for technical X.
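To compare sampler settings, the pattern can be counted with a simple regex. This is a rough sketch rather than the detector used in our eval: the pattern and the `paren_suffix_rate` helper are assumptions, and it will also flag legitimate function-call text such as `print(x)`, so it is only meaningful on prose responses.

```python
import re

# A letters-only word immediately followed (no space) by a short
# parenthesized word or phrase, e.g. "offline(text)".
PAREN_SUFFIX = re.compile(r"\b[A-Za-z]+\([A-Za-z][A-Za-z -]*\)")

def paren_suffix_rate(text):
    """Return (matches, matches per 100 whitespace-separated words)."""
    matches = PAREN_SUFFIX.findall(text)
    words = len(text.split())
    rate = 100.0 * len(matches) / words if words else 0.0
    return matches, rate

sample = (
    "You can enable offline(text) mode via the download(option) menu "
    "(see settings)."
)
matches, rate = paren_suffix_rate(sample)
print(matches, f"{rate:.1f} per 100 words")
```

Ordinary parentheticals preceded by a space, like "(see settings)" above, are not counted. Running this over a fixed prompt set at `T=0.4` and `T=1.0` gives a per-setting rate that can be compared directly.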

## Related observations elsewhere

The closest published reports are google-deepmind/gemma#581 (degenerate embeddings producing fixed attractor states for rare-script tokens) and arXiv 2509.09715 on "symbolic triggers" of hallucination in Gemma. Neither addresses the parenthetical-suffix pattern specifically.

## Ask

1. Confirmation of whether this is a known mode of Gemma 4 E2B's training distribution.
2. Guidance on whether other instruction-tuned fine-tunes already address it, or whether future Gemma 4 revisions are expected to.
3. Any data-side analysis that would help model consumers steer around it (preferred sampling profiles, system-prompt phrasing, etc.).

Happy to provide reproduction prompts + sample outputs.

## References

- google-ai-edge/LiteRT-LM#2202 — full field report on Gemma 4 E2B on-device behaviors
- google-ai-edge/LiteRT-LM#2249 — feature request for `repetitionPenalty` / per-segment sampler
- google-deepmind/gemma#622 — token repetition collapse on larger variants
- google-deepmind/gemma#581 — cuneiform attractor state in Gemma 3

