Open Source | DeepSeek | Released 2024-12

DeepSeek V3

Open-weights model that delivers near-frontier performance at a reported training cost of about $5.5M

Context window

128K

Input / 1M tokens

$0.27

Output / 1M tokens

$1.10

Provider

DeepSeek

Via DeepSeek API; free to self-host (open weights under MIT-style license)
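
To make the listed prices concrete, the sketch below estimates an API bill from token counts. The two prices are the figures listed above; the function name `estimate_cost_usd` and the example workload are illustrative assumptions, and actual billing may differ.

```python
# Listed prices in USD per 1M tokens, taken from the figures above.
INPUT_PRICE_PER_M = 0.27
OUTPUT_PRICE_PER_M = 1.10


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate API cost in USD for the given input and output token counts."""
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 10,000 requests, each with 2,000 input
    # tokens and 500 output tokens.
    requests = 10_000
    total = estimate_cost_usd(requests * 2_000, requests * 500)
    print(f"Estimated cost: ${total:.2f}")  # prints "Estimated cost: $10.90"
```

In this hypothetical workload, output tokens account for about half the bill despite being a fifth of the volume, so trimming response length is likely the first lever for cost-sensitive deployments.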

DeepSeek V3 achieved GPT-4o-level performance using only 2,048 H800 GPUs in training, at a reported cost of ~$5.5M. Its Mixture-of-Experts architecture activates only 37B of 671B total parameters per token, keeping inference efficient. Open weights under a permissive license mean you can run it locally or fine-tune without API dependency.
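
The sparse-activation idea can be illustrated with a toy top-k router. This is a minimal sketch of generic Mixture-of-Experts routing, not DeepSeek V3's actual router: the expert count, `k`, the softmax gating, and the names `route_top_k` and `active_fraction` are all assumptions made for illustration. Only the 37B and 671B figures come from the text above.

```python
import math
import random

TOTAL_PARAMS_B = 671   # total parameters in billions, from the text above
ACTIVE_PARAMS_B = 37   # parameters activated per token in billions, from the text above


def active_fraction(active: float, total: float) -> float:
    """Share of parameters used to process a single token."""
    return active / total


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def route_top_k(router_scores, k):
    """Select the k highest-scoring experts and renormalize their weights.

    Returns (expert_index, weight) pairs; every other expert is skipped,
    which is where the inference savings come from.
    """
    probs = softmax(router_scores)
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in chosen)
    return [(i, probs[i] / norm) for i in chosen]


if __name__ == "__main__":
    share = active_fraction(ACTIVE_PARAMS_B, TOTAL_PARAMS_B)
    print(f"Active share per token: {share:.1%}")  # about 5.5%

    # Toy setup: 8 experts with 2 active per token (illustrative values only).
    rng = random.Random(0)
    scores = [rng.gauss(0.0, 1.0) for _ in range(8)]
    for expert, weight in route_top_k(scores, k=2):
        print(f"expert {expert}: weight {weight:.3f}")
```

Because each token runs through only the selected experts, per-token compute tracks the active parameter count rather than the total, although all weights still need to be held in memory when self-hosting.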

Strengths

  • Near-GPT-4o quality at a fraction of the price
  • Open weights - self-host or fine-tune freely
  • Efficient MoE architecture reduces inference cost
  • Strong coding (Aider polyglot, HumanEval)
  • Good instruction following and structured output

Best for

  • Cost-sensitive, high-volume inference
  • Self-hosted deployments
  • Fine-tuning for specialized domains
  • Coding assistants

Benchmarks

Benchmark        Score    Notes
HumanEval        90.2%    Matches GPT-4o on coding
MMLU             88.5%    Matches GPT-4o on knowledge
Aider Polyglot   55.0%    Strong multi-language coding

Source: DeepSeek V3 technical report
