
Open Source · Preview · DeepSeek · Released 2026-04

DeepSeek V4

Open-weight 1.6T-parameter MoE frontier model built for agents, with a 1M-token context window.

Context window

1M tokens

Input / 1M tokens

Free

Output / 1M tokens

Free

Provider

DeepSeek

Open-weight (MIT license) and free to self-host. The official DeepSeek API is also available: deepseek-v4-pro costs approximately $0.435 per 1M input tokens (cache miss) and $0.87 per 1M output tokens; cache-hit input is far cheaper (~$0.0036 per 1M tokens). · Data verified 2026-07-02
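To make the hosted-API pricing concrete, here is a minimal cost-estimation sketch. It only does arithmetic on the approximate prices quoted above; the function name, the cache_hit_ratio parameter, and the example workload are illustrative assumptions, and actual billing may differ.

```python
# Approximate prices in USD per 1M tokens, copied from the figures quoted above.
PRICE_INPUT_CACHE_MISS = 0.435
PRICE_INPUT_CACHE_HIT = 0.0036
PRICE_OUTPUT = 0.87


def estimate_cost_usd(input_tokens: int, output_tokens: int, cache_hit_ratio: float = 0.0) -> float:
    """Estimate the cost of one request.

    cache_hit_ratio is the assumed share of input tokens served from cache (0 to 1).
    """
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    hit_tokens = input_tokens * cache_hit_ratio
    miss_tokens = input_tokens - hit_tokens
    total = (
        miss_tokens * PRICE_INPUT_CACHE_MISS
        + hit_tokens * PRICE_INPUT_CACHE_HIT
        + output_tokens * PRICE_OUTPUT
    )
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: 200K input tokens and 4K output tokens per call.
    cold = estimate_cost_usd(200_000, 4_000)
    warm = estimate_cost_usd(200_000, 4_000, cache_hit_ratio=0.9)
    print(f"no cache hits:  ${cold:.4f}")
    print(f"90% cache hits: ${warm:.4f}")
```

Under these assumptions, input tokens dominate the bill for long-context calls, which is why the cache-hit rate matters so much.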

DeepSeek V4 (Pro) is an open-weight mixture-of-experts model released in preview on April 24, 2026 under the MIT license. The Pro variant has 1.6T total parameters, of which 49B are active per token, and defaults to a 1M-token context window with up to 384K output tokens. It introduces DeepSeek Sparse Attention (DSA) plus token-wise compression, which DeepSeek says substantially reduce long-context compute and KV-cache footprint relative to V3.2. DeepSeek positions it as beating all current open models on math, STEM, and coding while rivaling top closed models on agentic coding.

Capability index

Relative estimates (0-100) to place this model against its peers, grounded in published benchmarks.

Categories: Coding, Reasoning, Math, Multimodal, Long context, Speed, Cost efficiency

How to access it

Download the open weights from Hugging Face (deepseek-ai/DeepSeek-V4-Pro) to self-host, or call the hosted DeepSeek API with model 'deepseek-v4-pro'. The model can also be served through inference providers (Together, DeepInfra) and locally via Ollama or vLLM.


Strengths

✓Open weights under a permissive MIT license, allowing full self-hosting and fine-tuning
✓1M-token context window that stays cost-effective thanks to sparse attention (DSA)
✓Leading open-weight performance on math, STEM, and agentic coding benchmarks
✓Very low API pricing relative to closed frontier models, with aggressive cache-hit discounts
✓Supports tool calls, JSON mode, and chat prefix completion
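As a rough illustration of the tool-calling support listed above, the sketch below defines one tool in the OpenAI-style "tools" format (which the hosted API should accept, given that it is described as OpenAI-compatible) and a small local dispatcher. The count_lines tool, the REGISTRY mapping, and the helper names are hypothetical and exist only for this example; no network call is made.

```python
import json

# Hypothetical tool, described in the OpenAI-style "tools" format.
TOOLS = [
    {
        "type": "function",
        "function": {
            "name": "count_lines",
            "description": "Count the number of lines in a block of text.",
            "parameters": {
                "type": "object",
                "properties": {
                    "text": {"type": "string", "description": "Text to measure."},
                },
                "required": ["text"],
            },
        },
    }
]


def count_lines(text: str) -> int:
    """Local implementation backing the hypothetical tool."""
    return len(text.splitlines())


# Maps tool names to the local Python callables that implement them.
REGISTRY = {"count_lines": count_lines}


def run_tool_call(name: str, arguments_json: str) -> str:
    """Execute one tool call; OpenAI-style responses carry arguments as a JSON string."""
    if name not in REGISTRY:
        raise KeyError(f"unknown tool: {name}")
    args = json.loads(arguments_json)
    result = REGISTRY[name](**args)
    return json.dumps({"result": result})


def build_request(user_message: str) -> dict:
    """Assemble keyword arguments for client.chat.completions.create(**kwargs)."""
    return {
        "model": "deepseek-v4-pro",
        "messages": [{"role": "user", "content": user_message}],
        "tools": TOOLS,
    }
```

In an agent loop, the string returned by run_tool_call would typically be sent back to the model as a message with role "tool", following the usual OpenAI-style convention.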

Best for developers who...

Self-hosted frontier reasoning and coding
Long-context agentic workflows
Cost-sensitive high-volume inference
Fine-tuning a strong open base model

When to choose it (and when not to)

Reach for DeepSeek V4 when...

→You need a frontier-class open model you can self-host for data control
→Your workload involves very long documents, codebases, or agent trajectories (up to 1M tokens)
→You want top-tier agentic coding at a fraction of closed-model cost
→You need to fine-tune or customize a strong base model

Look elsewhere if...

✕You need a stable, generally available (GA) model - V4 is still labeled preview
✕You lack the substantial GPU resources required to self-host a 1.6T-param MoE
✕You need native image or audio input (V4 is a text/reasoning model)
✕You require guaranteed enterprise SLAs unavailable from the hosted DeepSeek API
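To get a feel for the self-hosting requirement mentioned above, here is a back-of-the-envelope sketch of weight memory for a 1.6T-parameter model. The precision formats and the 80 GB per-GPU figure are assumptions chosen for illustration, not statements about how the checkpoint is distributed or what hardware DeepSeek recommends; the estimate covers weights only and ignores KV cache, activations, and runtime overhead.

```python
import math

TOTAL_PARAMS = 1.6e12   # total parameters, from the model description
ACTIVE_PARAMS = 49e9    # parameters active per token, from the model description

# Bytes per parameter for some common weight formats (assumed candidates).
BYTES_PER_PARAM = {"bf16": 2.0, "fp8": 1.0, "int4": 0.5}


def weight_memory_tb(num_params: float, bytes_per_param: float) -> float:
    """Memory needed for the weights alone, in terabytes (1 TB = 1e12 bytes)."""
    return num_params * bytes_per_param / 1e12


def min_gpus(num_params: float, bytes_per_param: float, gpu_memory_gb: float) -> int:
    """Lower bound on GPU count from weights only (no KV cache or overhead)."""
    total_gb = num_params * bytes_per_param / 1e9
    return math.ceil(total_gb / gpu_memory_gb)


if __name__ == "__main__":
    print(f"active fraction per token: {ACTIVE_PARAMS / TOTAL_PARAMS:.2%}")
    for fmt, nbytes in BYTES_PER_PARAM.items():
        tb = weight_memory_tb(TOTAL_PARAMS, nbytes)
        gpus = min_gpus(TOTAL_PARAMS, nbytes, gpu_memory_gb=80)  # hypothetical 80 GB GPUs
        print(f"{fmt}: ~{tb:.1f} TB of weights, at least {gpus} GPUs")
```

Although only a small share of parameters is active per token, all expert weights still have to be resident somewhere, so the total parameter count drives the memory requirement.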

How to use it

›Use the thinking (reasoning) mode for hard math and multi-step agentic tasks; use non-thinking mode for latency-sensitive chat
›Exploit the 1M context for whole-repo or whole-document reasoning, but structure long inputs with clear section markers
›Take advantage of cache-hit pricing by keeping stable system prompts and prefixes across calls
›Provide explicit tool schemas when using tool-calling for agentic workflows
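Two of the tips above, clear section markers and stable prefixes, can be combined in one prompt-building helper. This is a minimal sketch under the assumption that cache hits depend on requests sharing identical leading content; the marker format, the system prompt, and the function names are hypothetical choices rather than anything prescribed by DeepSeek.

```python
# Hypothetical system prompt, kept byte-for-byte identical across calls.
STABLE_SYSTEM_PROMPT = "You are a code-review assistant. Answer concisely."


def format_sections(documents: dict[str, str]) -> str:
    """Wrap each document in explicit begin/end markers, sorted by name for a stable layout."""
    parts = []
    for name in sorted(documents):
        parts.append(f"=== BEGIN {name} ===\n{documents[name]}\n=== END {name} ===")
    return "\n\n".join(parts)


def build_messages(documents: dict[str, str], question: str) -> list[dict[str, str]]:
    """Put unchanging content first and the per-call question last."""
    user_content = format_sections(documents) + "\n\nQuestion: " + question
    return [
        {"role": "system", "content": STABLE_SYSTEM_PROMPT},
        {"role": "user", "content": user_content},
    ]


if __name__ == "__main__":
    docs = {"utils.py": "def add(a, b):\n    return a + b", "README": "Small demo repo."}
    for q in ("What does utils.py do?", "Is the README accurate?"):
        messages = build_messages(docs, q)
        print(messages[1]["content"][-40:])
```

Because the documents are sorted and the question comes last, two calls over the same inputs differ only in their final characters, which is the layout a prefix-based cache would be expected to favor.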

Quickstart

Python

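import os

from openai import OpenAI

# The hosted API is OpenAI-compatible, so the OpenAI SDK works with a custom base_url.
client = OpenAI(
    # Read the key from the environment instead of hard-coding it;
    # the variable name DEEPSEEK_API_KEY is just the convention used here.
    api_key=os.environ["DEEPSEEK_API_KEY"],
    base_url="https://api.deepseek.com",
)

# Send a single-turn chat request to the V4 Pro model.
resp = client.chat.completions.create(
    model="deepseek-v4-pro",
    messages=[{"role": "user", "content": "Refactor this function and explain the change."}],
)
# The first choice holds the assistant's reply text.
print(resp.choices[0].message.content)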

DeepSeek's hosted API is OpenAI-compatible. For self-hosting, load deepseek-ai/DeepSeek-V4-Pro from Hugging Face with vLLM or run via Ollama.

API model id: deepseek-v4-pro

Benchmarks

Benchmark	Score	Notes
SWE-bench Verified	80.6%	Reported for the DeepSeek-V4-Pro-Max configuration as the highest open-weights entry; figure from secondary coverage, not the official docs page.
Math / STEM / Coding (open-model comparison)	Best among open models (per DeepSeek)	DeepSeek's announcement claims V4-Pro beats all current open models on math, STEM, and coding; specific per-benchmark numbers not listed on the announcement page.

Source: DeepSeek V4 Preview announcement (DeepSeek API Docs)

Compare DeepSeek V4

DeepSeek V4 vs Llama 4

Meta - context: up to 10M tokens (Scout); ~1M tokens (Maverick)


DeepSeek V4 vs Claude Sonnet 5

Anthropic - context: 1M tokens


DeepSeek V4 vs Qwen 3

Alibaba (Qwen Team) - context: 128K tokens (32K for the 0.6B/1.7B/4B dense variants)

