Closed SourceOpenAIReleased 2024-05
GPT-4o
OpenAI's multimodal workhorse - fast, affordable, and widely integrated
Context window
128K
Input / 1M tokens
$2.50
Output / 1M tokens
$10.00
Provider
OpenAI
GPT-4o ("omni") combines text, image, and audio understanding in a single model endpoint. It matches GPT-4 Turbo quality at 50% of the cost and significantly faster latency. The default model powering ChatGPT, it's the most widely deployed frontier model in the ecosystem.
Strengths
- ✓Native multimodal input (text, image, audio)
- ✓Fast response times at this capability level
- ✓Strong on structured data and JSON output
- ✓Best ecosystem support across SDKs and tools
- ✓Real-time audio capabilities
Best for developers who...
Multimodal appsHigh-volume production useChatbots and assistantsStructured data extraction
Benchmarks
| Benchmark | Score | Notes |
|---|---|---|
| MMLU | 88.7% | Matches GPT-4 Turbo |
| HumanEval | 90.2% | Strong Python coding |
| GPQA | 53.6% | Graduate-level science |
Source: OpenAI GPT-4o launch post
Tools powered by GPT-4o
Compare GPT-4o with
GPT-4o vs GPT-5
OpenAI - 128K ctx
GPT-4o vs o1
OpenAI - 200K ctx
GPT-4o vs Claude Sonnet 4.6
Anthropic - 200K ctx
GPT-4o vs Gemini 2.5 Flash
Google DeepMind - 1M ctx