05-05-2026 18:00 via blog.google

Accelerating Gemma 4: faster inference with multi-token prediction drafters

An overview of how Multi-Token Prediction (MTP) drafters are making Gemma 4 models up to 3x faster at inference.
Read more »