/drops

This drop

Google

Gemma 4

TypeTooling

ShippedMay 7, 2026

Recent drops

CursorCursor 2.2 ships Debug Mode, browser…

MetaMuse Spark ships with parallel Conte…

OpenAIOpenAI deprecates the legacy Chat Co…

OpenAIOpenAI ships three voice API models:…

AnthropicAnthropic ships ten Claude finance a…

theme

sources·policy·rss

© 2026 straced

Drops · Google

Gemma 4 multi-token prediction drafters speed up inference via speculative decoding

Google shipped multi-token prediction drafters for Gemma 4 — speculative decoding with shared KV cache; Google reports roughly 2x tokens/sec for the 26B model on an RTX PRO 6000 and ~2.2x on Apple Silicon edge models.

ProductGemma 4TypeToolingShippedMay 7, 2026

Read at Google →

More from Drops

affects: inference, open-source