DeepSeek V4
DeepSeek V4 is an open-weights large language model released by DeepSeek in April 2026, built on a 1.6-trillion-parameter mixture-of-experts architecture with 49 billion parameters active per token and a 1-million-token context window. It is available for self-hosted deployment under a permissive open license, through DeepSeek's own API at prices substantially below Western frontier-lab equivalents, and through the consumer chat interface at chat.deepseek.com. As of April 2026, DeepSeek V4 Pro ranks in the top ten on major standardized benchmarks, placing it in the leading group of open-weights frontier models alongside Meta AI's Llama 4 while approaching the performance of closed-source frontier models at a fraction of their cost.
At a glance
- Lab: DeepSeek
- Released: April 2026 (preview release; V4 Pro and V4 Flash variants)
- Modality: Text
- Open weights: Yes. Distributed under DeepSeek's open model license, which permits broad commercial and non-commercial use. Weights available on Hugging Face and GitHub.
- Context window: 1,000,000 tokens (V4 Pro)
- Pricing: Weights are free to self-host. DeepSeek API pricing at platform.deepseek.com is substantially below US frontier-lab equivalents (approximately $0.27 per million input tokens and $1.10 per million output tokens for V4 Pro as of release -- an order of magnitude below comparable closed-frontier API tiers). Consumer chat is free at chat.deepseek.com.
- Distribution channels: Hugging Face deepseek-ai organization, GitHub, DeepSeek API at platform.deepseek.com, consumer chat at chat.deepseek.com
Origins
DeepSeek was established in April 2023 as a research organization funded by High-Flyer, a Chinese quantitative hedge fund co-founded by Liang Wenfeng in 2016. High-Flyer had built large GPU clusters for trading research in 2020 and 2021; that existing compute base provided the foundation for DeepSeek's first models. Liang structured DeepSeek explicitly outside the trading business so that researchers could focus on architectural and training research without the product-launch pressures that constrain peer labs.
The first DeepSeek models in 2023 and early 2024 were dense and MoE language models in the 7-billion to 67-billion-parameter range, released open-weight and benchmarked against Llama, Mistral, and Alibaba's Qwen line. Specialized DeepSeek-Coder and DeepSeek-Math models shipped alongside the general-purpose releases.
The late-2024 release of DeepSeek-V3 shifted the company's global profile. V3 launched in December 2024 with 671 billion total parameters in a mixture-of-experts configuration (37 billion active) and a reported training cost of approximately $5 to $6 million using H800 GPUs -- dramatically lower than contemporaneous estimates of GPT-4-class training spend. DeepSeek-R1 followed in January 2025, applying large-scale reinforcement learning to V3 to produce a reasoning model that matched OpenAI's o1 on math, code, and reasoning benchmarks. DeepSeek's chat app, powered by R1, became the most-downloaded free app on the US Apple App Store within weeks of release, and the resulting sell-off in US AI hardware stocks became known in press coverage as the "DeepSeek moment."
The V4 preview, released April 24, 2026, is the next generation in the same lineage: a larger MoE architecture (1.6 trillion total vs. 671 billion in V3), a 1-million-token context window, and the first DeepSeek flagship built for close integration with Huawei's Ascend AI chips alongside the Nvidia-CUDA infrastructure that dominated prior releases. The preview coincided with the company's first reported external fundraising effort -- a $300 million round at a $10-billion-plus valuation -- after three years of funding entirely from High-Flyer profits.
Capabilities
DeepSeek V4 Pro handles text instruction-following, multi-turn dialogue, document analysis, code generation, and mathematical reasoning. The architecture reflects several design choices that distinguish it from contemporary open-weights releases.
The mixture-of-experts configuration activates 49 billion of 1.6 trillion total parameters per token. This yields quality broadly comparable to a much larger dense model while keeping per-token compute close to that of a 49-billion-parameter one -- a tradeoff that has become central to efficiency-focused frontier development. The architecture builds on innovations documented in DeepSeek's published technical reports: Multi-head Latent Attention (MLA), which compresses the key-value cache to reduce memory cost during inference; Multi-Token Prediction (MTP) as an auxiliary training objective, which densifies the training signal and can speed up decoding; and FP8 mixed-precision training, which reduces GPU memory and bandwidth requirements during the training run and contributed to the dramatically lower reported training costs of V3 and V4 relative to peer frontier models.
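A minimal sketch may make the sparse-activation tradeoff concrete. The PyTorch layer below implements generic top-k expert routing; the dimensions, expert count, and k are illustrative placeholders, and the load-balancing and shared-expert machinery described in DeepSeek's reports is omitted, so this shows the general technique rather than V4's actual router.

```python
import torch
import torch.nn as nn


class TopKMoELayer(nn.Module):
    """Toy top-k mixture-of-experts layer: each token is routed to k experts."""

    def __init__(self, d_model=1024, d_ff=4096, n_experts=64, k=4):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)  # gating network
        self.experts = nn.ModuleList(
            [
                nn.Sequential(
                    nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
                )
                for _ in range(n_experts)
            ]
        )

    def forward(self, x):  # x: (n_tokens, d_model)
        scores = self.router(x)                        # (n_tokens, n_experts)
        weights, expert_idx = scores.topk(self.k, -1)  # each token's top-k experts
        weights = weights.softmax(dim=-1)              # normalize gate weights
        out = torch.zeros_like(x)
        # Only k of n_experts experts run per token: compute scales with the
        # active parameters, while total capacity scales with all experts.
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = expert_idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out
```

With 64 experts and k=4, only 1/16 of the expert parameters run per token; the same lever, at vastly larger scale, is what lets V4 hold 1.6 trillion parameters while spending per-token compute on 49 billion.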
Integrated reasoning is a core feature of the V4 line. Where R1 was a separate reasoning-specialized release built on V3, V4 merges chat and chain-of-thought reasoning into the flagship, following the convergence seen in OpenAI's o-series and Claude's extended thinking. The model performs extended internal reasoning steps before generating a final response on mathematics, code debugging, and multi-step inference tasks.
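As a hedged sketch of how this surfaces to developers: DeepSeek's API is OpenAI-compatible, and for R1 the chain of thought was returned in a separate reasoning_content field alongside the final answer. Assuming V4 keeps that convention -- the model identifier below is a placeholder, not a confirmed name -- a call might look like this:

```python
from openai import OpenAI

# DeepSeek's API is OpenAI-compatible; base_url per platform.deepseek.com docs.
client = OpenAI(api_key="YOUR_DEEPSEEK_API_KEY", base_url="https://api.deepseek.com")

response = client.chat.completions.create(
    model="deepseek-v4",  # placeholder identifier; check the official docs
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
)

message = response.choices[0].message
# R1 exposed the internal reasoning trace as reasoning_content; whether V4
# does the same is an assumption here, hence the defensive getattr.
print(getattr(message, "reasoning_content", "<no reasoning trace exposed>"))
print(message.content)  # the final answer
```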
DeepSeek V4 Flash is a smaller, lower-cost companion model released alongside Pro, intended for latency-sensitive and cost-sensitive deployments where V4 Pro's full accuracy is not required. Flash's parameter counts have not been publicly specified as of the April 2026 preview.
Training and inference for V4 run on Huawei Ascend infrastructure alongside Nvidia GPUs, representing the first time a top-ten frontier model has been built and shipped in close integration with Chinese-made AI silicon. Whether this reduces DeepSeek's dependency on Nvidia CUDA in subsequent releases is one of the most-watched questions in the 2026 AI hardware landscape.
Benchmarks and standing
As of the April 2026 preview release, DeepSeek V4 Pro sits in the top-ten range across the major standardized benchmarks. The Artificial Analysis Intelligence Index places V4 Pro at rank 8 with a composite score of 51.51.
Individual benchmark positions at the preview release:
- LMArena general ELO leaderboard: rank 5; coding ELO leaderboard: rank 3, with an ELO of 1287.
- SWE-bench Verified (software engineering on real repositories): 64.2, rank 3 -- behind Claude Opus 4.7's 74.0 but ahead of most open-weights alternatives.
- GPQA Diamond (graduate-level scientific reasoning): 82.1, rank 6.
- HumanEval+ (function-completion coding): 91.2, rank 3.
- ARC-AGI Challenge: 79.5, rank 5.
- AIME 2025 (advanced mathematics competition): 85.0, rank 5.
These positions place V4 Pro close to the closed-source frontier (GPT-5.5 scores 60.24; Claude Opus 4.7 scores 57.28; Gemini 3.1 Pro scores 57.18) while trailing by 6 to 9 points on the composite. The gap is smaller on individual task benchmarks, particularly coding (SWE-bench rank 3, HumanEval+ rank 3) and reasoning (AIME rank 5, ARC-AGI rank 5), where V4 Pro competes with or exceeds several closed-frontier models.
Against open-weights peers, V4 Pro leads Llama 4 Maverick (mid-40s on the Intelligence Index) and is the top-ranked open-weights model on SWE-bench and HumanEval+ at the preview release. Qwen 3 is the closest Chinese open-weights competitor.
Benchmark positions are point-in-time. Leadership rotates on the scale of weeks given the release cadence through 2026.
Access and pricing
DeepSeek V4 weights are distributed through the deepseek-ai organization on Hugging Face. The weights are released under DeepSeek's open model license, which permits broad commercial and non-commercial use with minimal restrictions. The GitHub repository at github.com/deepseek-ai contains inference code and documentation.
The DeepSeek API is available at platform.deepseek.com. Per-token API pricing for V4 Pro at launch is approximately $0.27 per million input tokens and $1.10 per million output tokens -- consistently cited in press coverage as an order of magnitude below the equivalent pricing tiers from OpenAI, Anthropic, and Google. The cost differential is the primary commercial argument for DeepSeek V4 in developer and enterprise contexts where closed-frontier capability is acceptable but pricing is a constraint.
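The scale of the price gap is easy to make concrete. The arithmetic below uses the launch prices quoted above for V4 Pro; the closed-frontier figures are round placeholder numbers of the kind cited in coverage, not quoted prices for any specific model.

```python
# Cost per month in USD, given token volumes in millions of tokens.
def monthly_cost(input_mtok, output_mtok, in_price, out_price):
    return input_mtok * in_price + output_mtok * out_price

# Example workload: 5B input tokens and 1B output tokens per month.
v4_pro = monthly_cost(5_000, 1_000, in_price=0.27, out_price=1.10)
peer = monthly_cost(5_000, 1_000, in_price=3.00, out_price=15.00)  # placeholder tier

print(f"V4 Pro:          ${v4_pro:>9,.0f}")        # $2,450
print(f"Closed-frontier: ${peer:>9,.0f}")          # $30,000
print(f"Ratio:           {peer / v4_pro:.1f}x")    # ~12x: an order of magnitude
```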
The consumer interface at chat.deepseek.com provides free access to V4, including the integrated reasoning mode. At R1's launch in January 2025, demand exceeded DeepSeek's initial inference capacity and caused service disruptions; V4 has seen similar high-traffic patterns at preview.
For self-hosted deployment, the full 1.6-trillion-parameter weight set must be held in GPU memory even though only 49 billion parameters are active per token, so V4 Pro requires a multi-GPU -- and, at full precision, typically multi-node -- server configuration. Within that constraint it runs on standard inference tooling such as vLLM and TGI.
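A minimal self-hosting sketch with vLLM's offline API, assuming the weights ship under a Hugging Face repository ID like deepseek-ai/DeepSeek-V4 (the exact ID is an assumption) and that vLLM supports the V4 architecture:

```python
from vllm import LLM, SamplingParams

# Placeholder repo ID and parallelism degree. A 1.6T-parameter MoE needs its
# weights sharded across many GPUs (tensor and/or pipeline parallel),
# typically spanning multiple nodes at full precision.
llm = LLM(
    model="deepseek-ai/DeepSeek-V4",
    tensor_parallel_size=8,
    trust_remote_code=True,
)

params = SamplingParams(temperature=0.6, max_tokens=512)
outputs = llm.generate(["Explain Multi-head Latent Attention briefly."], params)
print(outputs[0].outputs[0].text)
```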
Comparison
Direct competitors to DeepSeek V4 Pro as of April 2026:
- GPT-5.5 (OpenAI). Composite benchmark leader at 60.24 on the Intelligence Index, roughly 9 points above V4 Pro. GPT-5.5 is closed-weights at OpenAI's standard API pricing. V4 Pro's differentiating factor is open-weights distribution and the API price gap: organizations that can tolerate V4's benchmark profile gain full fine-tuning control and dramatically lower per-token cost.
- Claude Opus 4.7 (Anthropic). Second on the Intelligence Index at 57.28, with a SWE-bench Verified score of 74.0 versus V4 Pro's 64.2 -- the largest per-benchmark gap between the two models. For coding-intensive workloads, Claude Opus 4.7 holds a measurable advantage. Claude is closed-weights.
- Gemini 3.1 Pro (Google DeepMind). Third on the Intelligence Index at 57.18, with a 2-million-token context window and native Google Search grounding for real-time web access -- capabilities V4 Pro does not match in standard deployment.
- Llama 4 (Meta AI). The primary open-weights peer from a Western lab. Llama 4 Maverick uses a 400-billion-parameter MoE with 17 billion active; V4 Pro uses 1.6 trillion total with 49 billion active. V4 Pro leads Maverick on most benchmark categories. The choice between them turns on supply-chain and regulatory considerations (Chinese-origin versus US-origin), ecosystem depth, and benchmark preference.
- Qwen 3 (Alibaba). The closest Chinese open-weights peer. Qwen 3 benchmarks competitively with Llama 4 Maverick and shows multilingual strength across Asian languages. V4 Pro leads on most English-language benchmarks, but the gap narrows on reasoning and math.
DeepSeek V4's distinctive position combines cost-efficiency leadership on API pricing, top-ranked open-weights results on coding and reasoning benchmarks, and status as China's flagship open-weights frontier release -- a combination no other model in the comparison set matches.
Outlook
Open questions for DeepSeek V4 and the broader DeepSeek trajectory over the next 6 to 18 months:
- V5 timeline. Whether V5 continues the MoE scaling path or adopts a hybrid reasoning architecture. The V3-to-V4 cycle ran roughly 16 months; a similar cadence would put V5 in the second half of 2027.
- US export-control constraints. DeepSeek trained V3 on H800 GPUs; V4 incorporated Huawei Ascend chips alongside Nvidia hardware. Whether Ascend can sustain frontier-scale training for V5 without Nvidia hardware is unresolved. If further US controls close remaining Nvidia channels, the Ascend integration in V4 may prove a critical hedge.
- Persistence of the cost-efficiency lead. US frontier labs have invested heavily in training-efficiency research since R1's release. Whether DeepSeek's algorithmic lead persists through V5 and beyond is a central competitive question.
- Regulatory and enterprise-buyer risk in Western markets. Several US federal agencies and European governments have restricted or flagged DeepSeek models on supply-chain and data-residency grounds. The trajectory of these restrictions through 2026 and 2027 will shape the addressable market for DeepSeek's commercial API in regulated sectors.
- The external funding round. The reported $300 million round at a $10-billion-plus valuation is DeepSeek's first external capital. The identity of the lead investors -- and any sovereign-fund or strategic-corporate participation -- will signal how independent the company remains as a research organization.
Sources
- DeepSeek: DeepSeek-V4 Technical Report. Official technical report and model weights repository.
- Hugging Face: deepseek-ai organization. Model weights, model cards, and release documentation for V4 Pro, V4 Flash, and prior DeepSeek releases.
- TechCrunch: DeepSeek previews new AI model that 'closes the gap' with frontier models. April 2026 V4 preview coverage.
- Fortune: DeepSeek unveils V4 model, with rock-bottom prices and close integration with Huawei's chips. V4 release pricing and Huawei Ascend integration.
- MIT Technology Review: Three reasons why DeepSeek's new model matters. Strategic context for V4 in the US-China AI competition framing.
- CNBC: China's DeepSeek releases preview of long-awaited V4 model. V4 preview release news coverage.
- DeepSeek API documentation. Official API pricing and access documentation for platform.deepseek.com.
- Artificial Analysis Intelligence Index. Composite benchmark scores; April 2026 data used in this profile.
- Wikipedia: DeepSeek. Company history, model lineage, and the January 2025 R1 market-impact coverage.