GPT-4o

OpenAI's versatile, fast multimodal workhorse (text + image)

Context window

128,000 tokens (16,384 max output)

Input / 1M tokens

$2.50

Output / 1M tokens

$10.00

Provider

OpenAI

Cached input: $1.25 per 1M tokens. Pricing reflects the gpt-4o-2024-08-06 and later snapshots. Older snapshots are deprecated; the current default remains available. · Data verified 2026-07-02

GPT-4o ('omni') is OpenAI's multimodal model, launched May 13, 2024 and designed for fast, cost-effective general use. It accepts text and image input and outputs text, with a 128,000-token context window, up to 16,384 output tokens, and a knowledge cutoff of October 1, 2023. Priced at $2.50 input / $10.00 output per 1M tokens, it delivers solid results across knowledge (MMLU 88.7), coding (HumanEval 90.2), and math (MATH 76.6) while being much faster and cheaper than the frontier GPT-5 generation. It remains a widely used default for everyday tasks, though newer GPT-5-series models exceed it on hard reasoning and long context.
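
To make the pricing concrete, here is a minimal sketch of a per-request cost estimate using the rates listed on this page. The function name `estimate_cost` and its parameters are illustrative, not part of any OpenAI SDK, and the sketch assumes that cached tokens are a subset of the input tokens and are billed at the cached-input rate instead of the standard one.

```python
# Prices in USD per 1M tokens, as listed on this page.
INPUT_PER_M = 2.50
CACHED_INPUT_PER_M = 1.25
OUTPUT_PER_M = 10.00


def estimate_cost(input_tokens: int, output_tokens: int, cached_tokens: int = 0) -> float:
    """Estimate the USD cost of one request (hypothetical helper).

    cached_tokens is the portion of input_tokens assumed to be billed
    at the cached-input rate.
    """
    if output_tokens < 0 or not 0 <= cached_tokens <= input_tokens:
        raise ValueError("token counts must be non-negative and cached_tokens <= input_tokens")
    uncached_tokens = input_tokens - cached_tokens
    total = (
        uncached_tokens * INPUT_PER_M
        + cached_tokens * CACHED_INPUT_PER_M
        + output_tokens * OUTPUT_PER_M
    )
    return total / 1_000_000


# 10,000 input tokens and 500 output tokens, with and without a cached prefix.
print(f"no cache:   ${estimate_cost(10_000, 500):.4f}")
print(f"8k cached:  ${estimate_cost(10_000, 500, cached_tokens=8_000):.4f}")
```

Under these assumptions the uncached request costs about $0.03 and the mostly cached one about $0.02, which shows why caching matters most for prompts dominated by a repeated prefix.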

Capability index

Relative estimates (0-100) to place this model against its peers, grounded in published benchmarks.

[Chart: scores for Coding, Reasoning, Math, Multimodal, Long context, Speed, and Cost efficiency]

How to access it

Available in the OpenAI API via model id 'gpt-4o' (latest snapshot 'gpt-4o-2024-11-20'; earlier 'gpt-4o-2024-08-06' and 'gpt-4o-2024-05-13') and in ChatGPT. The 2024-08-06 snapshot is marked deprecated, but the model remains active.


Strengths

✓Fast and inexpensive for a capable general-purpose model
✓Multimodal text + image input
✓Strong everyday coding and knowledge performance (HumanEval 90.2, MMLU 88.7)
✓Mature, extremely well-supported across tooling and integrations

Best for developers who need...

Fast, low-cost everyday assistant tasks
Multimodal (image + text) understanding
High-volume production workloads

When to choose it (and when not to)

Reach for GPT-4o when...

→Everyday assistant, drafting, summarization, and classification tasks
→Latency- and cost-sensitive applications at scale
→Multimodal tasks needing image understanding with fast responses

Look elsewhere if...

✕Hard reasoning, competition math, or complex agentic coding (use GPT-5-series)
✕Very long documents beyond 128K tokens (GPT-5.4/5.5 offer ~1.05M)
✕Tasks needing knowledge after October 2023
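
One way to apply the 128K limit above is a quick budget check before sending a request. The sketch below is illustrative: `fits_gpt4o` is a hypothetical helper, the prompt token count is assumed to come from a tokenizer of your choice, and it assumes the reserved output tokens count against the same 128,000-token window.

```python
# Limits as listed on this page.
CONTEXT_WINDOW = 128_000
MAX_OUTPUT_TOKENS = 16_384


def fits_gpt4o(prompt_tokens: int, max_output_tokens: int = MAX_OUTPUT_TOKENS) -> bool:
    """Return True if the prompt plus the reserved output fits in the window.

    Hypothetical helper; assumes input and output share one context window.
    """
    if prompt_tokens < 0 or max_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    # The model cannot emit more than its maximum output, so cap the reservation.
    reserved_output = min(max_output_tokens, MAX_OUTPUT_TOKENS)
    return prompt_tokens + reserved_output <= CONTEXT_WINDOW


print(fits_gpt4o(100_000))         # full output budget reserved
print(fits_gpt4o(120_000))         # too large with the full output budget
print(fits_gpt4o(120_000, 4_000))  # fits once the output budget is reduced
```

If the check fails, the options are to shorten or chunk the document, reserve fewer output tokens, or move to a model with a larger window.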

How to use it

›Be explicit and structured; GPT-4o follows direct instructions well without heavy reasoning prompts
›Put repeated system prompts at the start of each request so the shared prefix can be billed at the cached-input rate ($1.25/1M tokens)
›For image inputs, ask targeted questions about the image rather than open-ended prompts
›Escalate to a GPT-5-series model when a task needs deeper reasoning or bigger context
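
The image-input tip can be sketched as a small helper that pairs a targeted question with an image in the Chat Completions message format. `build_image_question` is a hypothetical name, and the URL is a placeholder rather than a real image.

```python
def build_image_question(question: str, image_url: str) -> list[dict]:
    """Build a single-turn message list with one text part and one image part.

    Hypothetical helper; the content-part layout follows the Chat Completions
    format for image input.
    """
    return [
        {
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }
    ]


# A targeted question tends to work better than "What is in this image?".
messages = build_image_question(
    "What is the total amount due on this receipt?",
    "https://example.com/receipt.png",  # placeholder URL, assumed for illustration
)
print(messages[0]["content"][0]["text"])

# To send it (requires the openai package and an API key):
# from openai import OpenAI
# response = OpenAI().chat.completions.create(model="gpt-4o", messages=messages)
```

Keeping the payload construction separate from the API call makes it easy to inspect or log the exact question being asked about each image.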

Quickstart

Python

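from openai import OpenAI

# The client reads the API key from the OPENAI_API_KEY environment variable.
client = OpenAI()

# Send a single user message to the gpt-4o alias via Chat Completions.
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Summarize this email in one sentence."}],
)
# The reply text is on the first choice.
print(response.choices[0].message.content)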

Pin snapshot 'gpt-4o-2024-11-20' for the latest stable behavior. gpt-4o supports text and image input, text output.

API model id: gpt-4o

Benchmarks

Benchmark	Score	Notes
MMLU	88.7%	General knowledge, per OpenAI's GPT-4o evaluations.
HumanEval	90.2%	Code generation, per OpenAI.
MATH	76.6%	Competition math, per OpenAI's GPT-4o evaluations.

Source: OpenAI - Hello GPT-4o