Closed Source · Google DeepMind · Released 2025-05

Gemini 2.5 Flash

Google's fastest and cheapest model with a 1M-token context window; hard to beat on price/performance

Context window: 1M tokens
Input / 1M tokens: $0.075
Output / 1M tokens: $0.30
Provider: Google DeepMind

Gemini 2.5 Flash delivers a strong price-to-performance ratio: a 1M-token context window, multimodal input, and strong reasoning at $0.075 per 1M input tokens. It is roughly 16x cheaper than Gemini 2.5 Pro on input while retaining most of its capabilities for typical tasks. It is the go-to choice when you need Gemini's long context at scale.
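To make the pricing concrete, here is a minimal sketch that estimates per-request cost from the prices listed on this page. The function name `estimate_cost_usd` and the example token counts are hypothetical, chosen only for illustration; actual billing may differ (for example by tier or input type), so treat the result as a rough estimate.

```python
# Prices as listed on this page, in USD per 1M tokens.
INPUT_USD_PER_M = 0.075
OUTPUT_USD_PER_M = 0.30


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Rough cost of one request at the listed per-token prices (hypothetical helper)."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_USD_PER_M + output_tokens * OUTPUT_USD_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Illustrative workload: summarize a 200k-token document into 2k tokens.
    per_request = estimate_cost_usd(200_000, 2_000)
    print(f"per request: ${per_request:.4f}")                     # per request: $0.0156
    print(f"per 10,000 requests: ${per_request * 10_000:.2f}")    # per 10,000 requests: $156.00
```

At these prices input tokens dominate the bill for long-context summarization, so trimming the prompt usually saves more than shortening the output.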

Strengths

  • Exceptional price-to-performance ratio
  • 1M context at near-commodity pricing
  • Multimodal support at low cost
  • Fast inference latency
  • Strong summarization and classification

Best for developers who...

  • High-volume, long-context tasks
  • Cost-sensitive production workloads
  • Document and media summarization
  • Retrieval-augmented pipelines

Benchmarks

Benchmark    Score    Notes
MMLU         ~89%     Strong for the price tier
HumanEval    ~85%     Solid coding

Source: Google DeepMind Gemini 2.5 Flash
