Closed Source - OpenAI - Released 2024-09
o1
OpenAI's reasoning model that thinks before it answers - best for hard science and math
Context window: 200K
Input / 1M tokens: $15.00
Output / 1M tokens: $60.00
Provider: OpenAI
o1 works through a problem with an internal chain of thought before it produces a response, spending extra compute on hidden reasoning rather than on a longer visible answer. On hard math, science, and complex coding problems, this makes it substantially stronger than GPT-4o. The trade-off is higher latency and cost, so o1 is best reserved for problems where quality matters more than speed.
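To make the cost side of that trade-off concrete, here is a minimal sketch of a per-request estimate using the prices listed on this card. The function name and the token counts are hypothetical, and the sketch assumes that hidden reasoning tokens are billed at the output rate, so check current billing rules before relying on it.

```python
# Prices from this card, in USD per 1M tokens.
INPUT_PRICE_PER_M = 15.00
OUTPUT_PRICE_PER_M = 60.00


def estimate_cost(input_tokens: int,
                  visible_output_tokens: int,
                  reasoning_tokens: int = 0) -> float:
    """Rough USD cost of one request (hypothetical helper).

    Assumption: hidden reasoning tokens are billed like output tokens.
    """
    billed_output = visible_output_tokens + reasoning_tokens
    total = (input_tokens * INPUT_PRICE_PER_M
             + billed_output * OUTPUT_PRICE_PER_M)
    return total / 1_000_000


if __name__ == "__main__":
    # Illustrative numbers only: a 2,000-token prompt, a 500-token answer,
    # and 4,000 tokens of hidden reasoning.
    print(f"${estimate_cost(2_000, 500, 4_000):.2f}")
```

Under that assumption, the reasoning tokens rather than the visible answer dominate the bill for a request like this one.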
Strengths
- ✓ Best-in-class math and physics
- ✓ Strong competitive coding (Codeforces, HumanEval)
- ✓ Scientific reasoning (GPQA top performer)
- ✓ Multi-step logic and planning
- ✓ 200K context for long technical documents
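As a rough illustration of the 200K context figure, the sketch below checks whether a document is likely to fit. It assumes about 4 characters per token, which is only a heuristic for English text, and a hypothetical reserve for reasoning and answer tokens. A real tokenizer would give exact counts.

```python
CONTEXT_WINDOW = 200_000  # tokens, from this card
CHARS_PER_TOKEN = 4       # assumption: rough average for English text


def estimate_tokens(text: str) -> int:
    """Approximate token count by character length (ceiling division)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def fits_in_context(text: str, reserved_for_output: int = 25_000) -> bool:
    """True if the text plus a reserve for output is within the window.

    reserved_for_output is a hypothetical budget for reasoning and
    answer tokens, not a documented limit.
    """
    return estimate_tokens(text) + reserved_for_output <= CONTEXT_WINDOW


if __name__ == "__main__":
    document = "x" * 400_000  # stand-in for a long technical document
    print(estimate_tokens(document), fits_in_context(document))
```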
Best for developers who...
- Math and science problems
- Competitive programming
- Complex multi-step reasoning
- Research assistance
Benchmarks
| Benchmark | Score | Notes |
|---|---|---|
| GPQA Diamond | 78.3% | Expert-level science questions |
| HumanEval | 92.4% | Function-level Python code generation |
| SWE-bench Verified | 48.9% | Strong software engineering |
Source: OpenAI o1 technical overview
Compare o1 with
- o1 vs GPT-5 (OpenAI, 128K ctx)
- o1 vs GPT-4o (OpenAI, 128K ctx)
- o1 vs Claude Opus 4.7 (Anthropic, 200K ctx)
- o1 vs Gemini 2.5 Pro (Google DeepMind, 1M ctx)