Description
Gemma4Tokenizer does not define FORBIDDEN_TOKENS, unlike Gemma3Tokenizer and Gemma3nTokenizer, which both forbid the image start/end tokens from being generated during sampling.
As a result, when sampling with a Gemma 4 model in text-only mode, nothing stops the sampler from generating raw image/audio placeholder tokens (<|image|>, <|image>, <image|>, <|audio|>, <|audio>, <audio|>), which would corrupt the output.
Comparison
# Gemma3Tokenizer (line ~440): correctly forbids image tokens
FORBIDDEN_TOKENS = (
    special_tokens.START_OF_IMAGE,
    special_tokens.END_OF_IMAGE,
)
# Gemma3nTokenizer (line ~465): same
FORBIDDEN_TOKENS = (
    special_tokens.START_OF_IMAGE,
    special_tokens.END_OF_IMAGE,
)
# Gemma4Tokenizer (line ~475): MISSING, inherits the empty tuple from the base class
# (no FORBIDDEN_TOKENS defined)
How it's used
In gemma/gm/text/_sampler.py:501:
forbidden_tokens += self.tokenizer.FORBIDDEN_TOKENS
For Gemma4, this adds nothing, so multimodal tokens are never masked out.
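To make the effect concrete, here is a minimal, self-contained sketch of how an empty FORBIDDEN_TOKENS tuple lets a placeholder token win during greedy sampling. It does not use the real gemma code: the token ids, the stand-in tokenizer classes, and the helpers mask_forbidden and greedy_pick are all hypothetical, and it assumes masking is done by setting the logits of forbidden ids to negative infinity, which may differ from what _sampler.py actually does.

```python
import math

# Hypothetical token ids, for illustration only (not the real Gemma vocabulary).
START_OF_IMAGE = 5
END_OF_IMAGE = 6


class BaseTokenizer:
    # Stand-in for the base class default: nothing is forbidden.
    FORBIDDEN_TOKENS = ()


class TokenizerWithForbidden(BaseTokenizer):
    # Stand-in for a tokenizer that overrides the attribute.
    FORBIDDEN_TOKENS = (START_OF_IMAGE, END_OF_IMAGE)


def mask_forbidden(logits, tokenizer, extra_forbidden=()):
    """Return a copy of `logits` with forbidden token ids set to -inf."""
    forbidden = tuple(extra_forbidden) + tuple(tokenizer.FORBIDDEN_TOKENS)
    masked = list(logits)
    for token_id in forbidden:
        masked[token_id] = -math.inf
    return masked


def greedy_pick(logits):
    """Return the index of the largest logit."""
    return max(range(len(logits)), key=logits.__getitem__)


# The model (hypothetically) puts the most mass on the image tokens.
logits = [0.1, 0.3, 0.2, 0.0, 0.5, 2.0, 1.5, 0.4]

unmasked_choice = greedy_pick(mask_forbidden(logits, BaseTokenizer()))
masked_choice = greedy_pick(mask_forbidden(logits, TokenizerWithForbidden()))
```

With the empty tuple, the highest-scoring token is the placeholder itself; with the override, the sampler falls back to the best ordinary token.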
Proposed Fix
Add FORBIDDEN_TOKENS to Gemma4Tokenizer covering all multimodal placeholder tokens:
class Gemma4Tokenizer(Tokenizer):
    ...
    FORBIDDEN_TOKENS = (
        special_tokens.IMAGE_PLACEHOLDER,
        special_tokens.START_OF_IMAGE,
        special_tokens.END_OF_IMAGE,
        special_tokens.AUDIO_PLACEHOLDER,
        special_tokens.START_OF_AUDIO,
        special_tokens.END_OF_AUDIO,
    )
Location
- File: gemma/gm/text/_tokenizer.py
- Class: Gemma4Tokenizer (around line 475)