Overview
- The model is available in preview through the Gemini API in Google AI Studio and for enterprises via Vertex AI, with early users including Latitude, Cartwheel and Whering.
- Pricing is set at $0.25 per 1 million input tokens and $1.50 per 1 million output tokens.
- Google reports that time to first token is 2.5x faster and output speed is 45% higher than with Gemini 2.5 Flash, citing the Artificial Analysis benchmark.
- Published results list an Arena.ai Elo score of 1432, 86.9% on GPQA Diamond and 76.8% on MMMU Pro.
- Developers can adjust built‑in thinking levels to balance latency, cost and depth for tasks such as translation, content moderation, UI generation and simulations.
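
To make the listed pricing concrete, the following is a minimal sketch of the per-request arithmetic. It uses only the two rates quoted above ($0.25 per 1 million input tokens, $1.50 per 1 million output tokens). The function name `estimate_cost_usd` and the example token counts are hypothetical, chosen for illustration, and the sketch assumes no charges beyond those two rates.

```python
# Rates taken from the pricing bullet above (USD per 1 million tokens).
INPUT_USD_PER_MILLION = 0.25
OUTPUT_USD_PER_MILLION = 1.50


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate cost in USD at the listed preview rates (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = input_tokens / 1_000_000 * INPUT_USD_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_USD_PER_MILLION
    return input_cost + output_cost


if __name__ == "__main__":
    # Illustrative workload: 2M input tokens and 500K output tokens.
    # 2 * 0.25 + 0.5 * 1.50 = 0.50 + 0.75 = 1.25 USD
    print(f"${estimate_cost_usd(2_000_000, 500_000):.2f}")
```

Because output tokens cost six times as much as input tokens at these rates, output-heavy workloads are the ones where cost adds up fastest.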