Closed SourceGoogle DeepMindReleased 2025-05
Gemini 2.5 Flash
Google's fastest and cheapest model with a 1M context - hard to beat on price/performance
Context window
1M
Input / 1M tokens
$0.075
Output / 1M tokens
$0.30
Provider
Google DeepMind
Gemini 2.5 Flash delivers a remarkable price-to-performance ratio: 1M context window, multimodal input, and strong reasoning at $0.075/1M input tokens. It's 16x cheaper than Gemini 2.5 Pro on input while retaining most capabilities for typical tasks. The go-to choice when you need Gemini's long context at scale.
Strengths
- ✓Exceptional price-to-performance ratio
- ✓1M context at near-commodity pricing
- ✓Multimodal support at low cost
- ✓Fast inference latency
- ✓Strong summarization and classification
Best for developers who...
High-volume, long-context tasksCost-sensitive production workloadsDocument and media summarizationRetrieval-augmented pipelines
Benchmarks
| Benchmark | Score | Notes |
|---|---|---|
| MMLU | ~89% | Strong for the price tier |
| HumanEval | ~85% | Solid coding |
Source: Google DeepMind Gemini 2.5 Flash
Compare Gemini 2.5 Flash with
Gemini 2.5 Flash vs Gemini 2.5 Pro
Google DeepMind - 1M ctx
Gemini 2.5 Flash vs Claude Haiku 4.5
Anthropic - 200K ctx
Gemini 2.5 Flash vs GPT-4o
OpenAI - 128K ctx