DeepSeek V4
DeepSeek V4 is an open-weights large language model released by DeepSeek in April 2026, built on a 1.6-trillion-parameter mixture-of-experts architecture with 49 billion parameters active per token and a 1-million-token context window. It is available for self-hosted deployment under a permissive open license, through DeepSeek's own API at prices substantially below Western frontier-lab equivalents, and through the consumer chat interface at chat.deepseek.com. As of April 2026, DeepSeek V4 Pro ranks in the top ten on major standardized benchmarks, placing it in the leading group of open-weights frontier models alongside Meta AI's Llama 4 while approaching the performance of closed-source frontier models at a fraction of their cost.
At a glance
- Lab: DeepSeek
- Released: April 2026 (preview release; V4 Pro and V4 Flash variants)
- Modality: Text
- Open weights: Yes. Distributed under DeepSeek's open model license, which permits broad commercial and non-commercial use. Weights available on Hugging Face and GitHub.
- Context window: 1,000,000 tokens (V4 Pro)
- Pricing: Weights are free to self-host. DeepSeek API pricing at platform.deepseek.com is substantially below US frontier-lab equivalents (approximately $0.27 per million input tokens and $1.10 per million output tokens for V4 Pro as of release -- an order of magnitude below comparable closed-frontier API tiers). Consumer chat is free at chat.deepseek.com.
- Distribution channels: Hugging Face deepseek-ai organization, GitHub, DeepSeek API at platform.deepseek.com, consumer chat at chat.deepseek.com
Origins
DeepSeek was established in April 2023 as a research organization funded by High-Flyer, a Chinese quantitative hedge fund co-founded by Liang Wenfeng in 2016. High-Flyer had built large GPU clusters for trading research in 2020 and 2021; that existing compute base provided the foundation for DeepSeek's first models. Liang structured DeepSeek explicitly outside the trading business so that researchers could focus on architectural and training research without the product-launch pressures that constrain peer labs.
The first DeepSeek models in 2023 and early 2024 were dense and MoE language models in the 7-billion to 67-billion-parameter range, released open-weight and benchmarked against Llama, Mistral, and Alibaba's Qwen line. Specialized DeepSeek-Coder and DeepSeek-Math models shipped alongside the general-purpose releases.
The late-2024 release of DeepSeek-V3 shifted the company's global profile. V3 launched in December 2024 with 671 billion total parameters in a mixture-of-experts configuration (37 billion active) and a reported training cost of approximately $5 to $6 million using H800 GPUs -- dramatically lower than contemporaneous estimates of GPT-4-class training spend. DeepSeek-R1 followed in January 2025, applying large-scale reinforcement learning to V3 to produce a reasoning model that matched OpenAI's o1 on math, code, and reasoning benchmarks. DeepSeek's chat app, powered by R1, became the most-downloaded free app on the US Apple App Store within weeks of release, and the resulting sell-off in US AI hardware stocks became known in press coverage as the "DeepSeek moment."
The V4 preview, released April 24, 2026, is the next generation in the same lineage: a larger MoE architecture (1.6 trillion total vs. 671 billion in V3), a 1-million-token context window, and the first DeepSeek flagship built for close integration with Huawei's Ascend AI chips alongside the Nvidia-CUDA infrastructure that dominated prior releases. The preview coincided with the company's first reported external fundraising effort -- a $300 million round at a $10-billion-plus valuation -- after three years of funding entirely from High-Flyer profits.
Capabilities
DeepSeek V4 Pro handles text instruction-following, multi-turn dialogue, document analysis, code generation, and mathematical reasoning. The architecture reflects several design choices that distinguish it from contemporary open-weights releases.
The mixture-of-experts configuration activates 49 billion of 1.6 trillion total parameters per token. This yields quality broadly comparable to a much larger dense model while keeping per-token compute close to that of a 49-billion-parameter one -- a tradeoff that has become central to efficiency-focused frontier development. The architecture builds on innovations documented in DeepSeek's published technical reports: Multi-head Latent Attention (MLA), which compresses the key-value cache to reduce memory cost during inference; Multi-Token Prediction (MTP) as an auxiliary training objective, which densifies the training signal and can speed up decoding; and FP8 mixed-precision training, which reduces GPU memory and bandwidth requirements during the training run and contributed to the dramatically lower reported training costs of V3 and V4 relative to peer frontier models.
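A minimal sketch may make the sparse-activation tradeoff concrete. The PyTorch layer below implements generic top-k expert routing; the dimensions, expert count, and k are illustrative placeholders, and the load-balancing and shared-expert machinery described in DeepSeek's reports is omitted, so this shows the general technique rather than V4's actual router.

```python
import torch
import torch.nn as nn


class TopKMoELayer(nn.Module):
    """Toy top-k mixture-of-experts layer: each token is routed to k experts."""

    def __init__(self, d_model=1024, d_ff=4096, n_experts=64, k=4):
        super().__init__()
        self.k = k
        self.router = nn.Linear(d_model, n_experts)  # gating network
        self.experts = nn.ModuleList(
            [
                nn.Sequential(
                    nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
                )
                for _ in range(n_experts)
            ]
        )

    def forward(self, x):  # x: (n_tokens, d_model)
        scores = self.router(x)                        # (n_tokens, n_experts)
        weights, expert_idx = scores.topk(self.k, -1)  # each token's top-k experts
        weights = weights.softmax(dim=-1)              # normalize gate weights
        out = torch.zeros_like(x)
        # Only k of n_experts experts run per token: compute scales with the
        # active parameters, while total capacity scales with all experts.
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                mask = expert_idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out
```

With 64 experts and k=4, only 1/16 of the expert parameters run per token; the same lever, at vastly larger scale, is what lets V4 hold 1.6 trillion parameters while spending per-token compute on 49 billion.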
Integrated reasoning is a core feature of the V4 line. Where R1 was a separate reasoning-specialized release built on V3, V4 merges chat and chain-of-thought reasoning into the flagship, following the convergence seen in OpenAI's o-series and Claude's extended thinking. The model performs extended internal reasoning steps before generating a final response on mathematics, code debugging, and multi-step inference tasks.
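As a hedged sketch of how this surfaces to developers: DeepSeek's API is OpenAI-compatible, and for R1 the chain of thought was returned in a separate reasoning_content field alongside the final answer. Assuming V4 keeps that convention -- the model identifier below is a placeholder, not a confirmed name -- a call might look like this:

```python
from openai import OpenAI

# DeepSeek's API is OpenAI-compatible; base_url per platform.deepseek.com docs.
client = OpenAI(api_key="YOUR_DEEPSEEK_API_KEY", base_url="https://api.deepseek.com")

response = client.chat.completions.create(
    model="deepseek-v4",  # placeholder identifier; check the official docs
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
)

message = response.choices[0].message
# R1 exposed the internal reasoning trace as reasoning_content; whether V4
# does the same is an assumption here, hence the defensive getattr.
print(getattr(message, "reasoning_content", "<no reasoning trace exposed>"))
print(message.content)  # the final answer
```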
DeepSeek V4 Flash is a smaller, lower-cost companion model released alongside Pro, intended for latency-sensitive and cost-sensitive deployments where V4 Pro's full accuracy is not required. Flash's parameter counts have not been publicly specified as of the April 2026 preview.
Training and inference for V4 run on Huawei Ascend infrastructure alongside Nvidia GPUs, representing the first time a top-ten frontier model has been built and shipped in close integration with Chinese-made AI silicon. Whether this reduces DeepSeek's dependency on Nvidia CUDA in subsequent releases is one of the most-watched questions in the 2026 AI hardware landscape.
Benchmarks and standing
As of the April 2026 preview release, DeepSeek V4 Pro sits in the top-ten range across the major standardized benchmarks. The Artificial Analysis Intelligence Index places V4 Pro at rank 8 with a composite score of 51.51.
Individual benchmark positions at the preview release:
- LMArena general ELO leaderboard: rank 5; coding ELO leaderboard: rank 3, with an ELO of 1287.
- SWE-bench Verified (software engineering on real repositories): 64.2, rank 3 -- behind Claude Opus 4.7's 74.0 but ahead of most open-weights alternatives.
- GPQA Diamond (graduate-level scientific reasoning): 82.1, rank 6.
- HumanEval+ (function-completion coding): 91.2, rank 3.
- ARC-AGI Challenge: 79.5, rank 5.
- AIME 2025 (advanced mathematics competition): 85.0, rank 5.
These positions place V4 Pro close to the closed-source frontier (GPT-5.5 scores 60.24; Claude Opus 4.7 scores 57.28; Gemini 3.1 Pro scores 57.18) while trailing by 6 to 9 points on the composite. The gap is smaller on individual task benchmarks, particularly coding (SWE-bench rank 3, HumanEval+ rank 3) and reasoning (AIME rank 5, ARC-AGI rank 5), where V4 Pro competes with or exceeds several closed-frontier models.
Against open-weights peers, V4 Pro leads Llama 4 Maverick (mid-40s on the Intelligence Index) and is the top-ranked open-weights model on SWE-bench and HumanEval+ at the preview release. Qwen 3 is the closest Chinese open-weights competitor.
Benchmark positions are point-in-time. Leadership rotates on the scale of weeks given the release cadence through 2026.
Access and pricing
DeepSeek V4 weights are distributed through the deepseek-ai organization on Hugging Face. The weights are released under DeepSeek's open model license, which permits broad commercial and non-commercial use with minimal restrictions. The GitHub repository at github.com/deepseek-ai contains inference code and documentation.
The DeepSeek API is available at platform.deepseek.com. Per-token API pricing for V4 Pro at launch is approximately $0.27 per million input tokens and $1.10 per million output tokens -- consistently cited in press coverage as an order of magnitude below the equivalent pricing tiers from OpenAI, Anthropic, and Google. The cost differential is the primary commercial argument for DeepSeek V4 in developer and enterprise contexts where closed-frontier capability is acceptable but pricing is a constraint.
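The scale of the price gap is easy to make concrete. The arithmetic below uses the launch prices quoted above for V4 Pro; the closed-frontier figures are round placeholder numbers of the kind cited in coverage, not quoted prices for any specific model.

```python
# Cost per month in USD, given token volumes in millions of tokens.
def monthly_cost(input_mtok, output_mtok, in_price, out_price):
    return input_mtok * in_price + output_mtok * out_price

# Example workload: 5B input tokens and 1B output tokens per month.
v4_pro = monthly_cost(5_000, 1_000, in_price=0.27, out_price=1.10)
peer = monthly_cost(5_000, 1_000, in_price=3.00, out_price=15.00)  # placeholder tier

print(f"V4 Pro:          ${v4_pro:>9,.0f}")        # $2,450
print(f"Closed-frontier: ${peer:>9,.0f}")          # $30,000
print(f"Ratio:           {peer / v4_pro:.1f}x")    # ~12x: an order of magnitude
```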
The consumer interface at chat.deepseek.com provides free access to V4, including the integrated reasoning mode. At R1's launch in January 2025, demand exceeded DeepSeek's initial inference capacity and caused service disruptions; V4 has seen similar high-traffic patterns at preview.
For self-hosted deployment, the full 1.6-trillion-parameter weight set must be held in GPU memory even though only 49 billion parameters are active per token, so V4 Pro requires a multi-GPU -- and, at full precision, typically multi-node -- server configuration. Within that constraint it runs on standard inference tooling such as vLLM and TGI.
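A minimal self-hosting sketch with vLLM's offline API, assuming the weights ship under a Hugging Face repository ID like deepseek-ai/DeepSeek-V4 (the exact ID is an assumption) and that vLLM supports the V4 architecture:

```python
from vllm import LLM, SamplingParams

# Placeholder repo ID and parallelism degree. A 1.6T-parameter MoE needs its
# weights sharded across many GPUs (tensor and/or pipeline parallel),
# typically spanning multiple nodes at full precision.
llm = LLM(
    model="deepseek-ai/DeepSeek-V4",
    tensor_parallel_size=8,
    trust_remote_code=True,
)

params = SamplingParams(temperature=0.6, max_tokens=512)
outputs = llm.generate(["Explain Multi-head Latent Attention briefly."], params)
print(outputs[0].outputs[0].text)
```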
Comparison
Direct competitors to DeepSeek V4 Pro as of April 2026:
- GPT-5.5 (OpenAI). Composite benchmark leader at 60.24 on the Intelligence Index, roughly 9 points above V4 Pro. GPT-5.5 is closed-weights at OpenAI's standard API pricing. V4 Pro's differentiating factor is open-weights distribution and the API price gap: organizations that can tolerate V4's benchmark profile gain full fine-tuning control and dramatically lower per-token cost.
- Claude Opus 4.7 (Anthropic). Second on the Intelligence Index at 57.28, with a SWE-bench Verified score of 74.0 versus V4 Pro's 64.2 -- the largest per-benchmark gap between the two models. For coding-intensive workloads, Claude Opus 4.7 holds a measurable advantage. Claude is closed-weights.
- Gemini 3.1 Pro (Google DeepMind). Third on the Intelligence Index at 57.18, with a 2-million-token context window and native Google Search grounding for real-time web access -- capabilities V4 Pro does not match in standard deployment.
- Llama 4 (Meta AI). The primary open-weights peer from a Western lab. Llama 4 Maverick uses a 400-billion-parameter MoE with 17 billion active; V4 Pro uses 1.6 trillion total with 49 billion active. V4 Pro leads Maverick on most benchmark categories. The choice between them turns on supply-chain and regulatory considerations (Chinese-origin versus US-origin), ecosystem depth, and benchmark preference.
- Qwen 3 (Alibaba). The closest Chinese open-weights peer. Qwen 3 benchmarks competitively with Llama 4 Maverick and shows multilingual strength across Asian languages. V4 Pro leads on most English-language benchmarks, but the gap narrows on reasoning and math.
DeepSeek V4's distinctive position combines cost-efficiency leadership on API pricing, top-ranked open-weights results on coding and reasoning benchmarks, and status as China's flagship open-weights frontier release -- a combination no other model in the comparison set matches.
Outlook
Open questions for DeepSeek V4 and the broader DeepSeek trajectory over the next 6 to 18 months:
- V5 timeline. Whether V5 continues the MoE scaling path or adopts a hybrid reasoning architecture. The V3-to-V4 cycle ran roughly 16 months; a similar cadence would put V5 in the second half of 2027.
- US export-control constraints. DeepSeek trained V3 on H800 GPUs; V4 incorporated Huawei Ascend chips alongside Nvidia hardware. Whether Ascend can sustain frontier-scale training for V5 without Nvidia hardware is unresolved. If further US controls close remaining Nvidia channels, the Ascend integration in V4 may prove a critical hedge.
- Persistence of the cost-efficiency lead. US frontier labs have invested heavily in training-efficiency research since R1's release. Whether DeepSeek's algorithmic lead persists through V5 and beyond is a central competitive question.
- Regulatory and enterprise-buyer risk in Western markets. Several US federal agencies and European governments have restricted or flagged DeepSeek models on supply-chain and data-residency grounds. The trajectory of these restrictions through 2026 and 2027 will shape the addressable market for DeepSeek's commercial API in regulated sectors.
- The external funding round. The reported $300 million round at a $10-billion-plus valuation is DeepSeek's first external capital. The identity of the lead investors -- and any sovereign-fund or strategic-corporate participation -- will signal how independent the company remains as a research organization.
Sources
- DeepSeek: DeepSeek-V4 Technical Report. Official technical report and model weights repository.
- Hugging Face: deepseek-ai organization. Model weights, model cards, and release documentation for V4 Pro, V4 Flash, and prior DeepSeek releases.
- TechCrunch: DeepSeek previews new AI model that 'closes the gap' with frontier models. April 2026 V4 preview coverage.
- Fortune: DeepSeek unveils V4 model, with rock-bottom prices and close integration with Huawei's chips. V4 release pricing and Huawei Ascend integration.
- MIT Technology Review: Three reasons why DeepSeek's new model matters. Strategic context for V4 in the US-China AI competition framing.
- CNBC: China's DeepSeek releases preview of long-awaited V4 model. V4 preview release news coverage.
- DeepSeek API documentation. Official API pricing and access documentation for platform.deepseek.com.
- Artificial Analysis Intelligence Index. Composite benchmark scores; April 2026 data used in this profile.
- Wikipedia: DeepSeek. Company history, model lineage, and the January 2025 R1 market-impact coverage.