V4 is DeepSeek’s most consequential release since R1, the reasoning model it launched in January 2025. Developed under resource constraints, R1 drew global attention for its performance and efficiency, and it turned DeepSeek from an obscure research group into China’s preeminent AI company almost overnight. Its success also set off a wave of open-weight model releases from other Chinese AI firms, reshaping competition across the industry.

After R1’s debut, DeepSeek kept a relatively low profile. Earlier this month, however, the company hinted that V4 was imminent by adding "expert" and "flash" modes to the online version of its model. The change fueled speculation among AI researchers and developers that a major release was on the way.

DeepSeek has become a potent symbol of China’s AI ambitions, but its return to the frontier comes at a difficult moment. The company has faced notable personnel departures, delays to previous model launches, and growing scrutiny from both the US and Chinese governments, a reminder of the geopolitical and economic pressures shaping the AI sector.

Will V4 have the same seismic impact on the field that R1 did? An exact repeat is unlikely, but the release matters for three reasons.

1. It breaks new ground for an open-source model.

As it did with R1, DeepSeek claims that V4 rivals leading proprietary models on benchmarks at a substantially lower cost. That is good news for developers and businesses, who can use advanced AI capabilities on their own terms with less worry about escalating expenses.

V4 comes in two versions, both available through DeepSeek’s website and app, with API access for developers. V4-Pro, the larger model, is built for coding and complex agentic tasks. V4-Flash is a smaller version optimized for speed and low deployment cost. Both offer reasoning modes, in which the model works through a user’s prompt and shows its problem-solving process step by step.

For V4-Pro, DeepSeek charges $1.74 per million input tokens and $3.48 per million output tokens, a fraction of what OpenAI and Anthropic charge for comparable models. V4-Flash is cheaper still, at approximately $0.14 per million input tokens and $0.28 per million output tokens, which makes it one of the least expensive top-tier models available for building applications.
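To make the price gap concrete, here is a minimal sketch that applies the per-million-token rates quoted above to a single workload. The workload size (50 million input tokens and 10 million output tokens) is a hypothetical figure chosen for illustration, and the function name `estimate_cost` is ours, not part of any DeepSeek API. Actual bills may differ, since the Flash prices are approximate and discounts or caching are not modeled.

```python
# Prices in USD per million tokens, as quoted in the article.
# The V4-Flash figures are approximate.
PRICES = {
    "V4-Pro": {"input": 1.74, "output": 3.48},
    "V4-Flash": {"input": 0.14, "output": 0.28},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one workload (illustrative helper)."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 50 million input tokens, 10 million output tokens.
for model in PRICES:
    cost = estimate_cost(model, 50_000_000, 10_000_000)
    print(f"{model}: ${cost:.2f}")
```

Under these assumptions the same workload costs about $121.80 on V4-Pro and about $9.80 on V4-Flash.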

In raw performance, V4 is a major leap from R1 and a serious alternative to the latest generation of large AI models. According to benchmark results provided by the company, V4-Pro performs on par with leading closed-source models, including Anthropic’s Claude-Opus-4.6, OpenAI’s GPT-5.4, and Google’s Gemini-3.1. Against other open-source contenders such as Alibaba’s Qwen-3.5 and Z.ai’s GLM-5.1, it comes out ahead across coding, mathematics, and STEM problems, making it one of the most capable open-source models ever released.

DeepSeek also reports that V4-Pro ranks among the leading open-source models for agentic coding and handles multi-step problems well. The model’s writing ability and breadth of world knowledge stand out too, according to the company’s benchmark data. In an internal survey of 85 experienced developers, described in the technical report released with the model, more than 90% ranked V4-Pro among their preferred choices for coding. DeepSeek has also tuned V4 to work with popular agent frameworks such as Claude Code, OpenClaw, and CodeBuddy.

2. It delivers on a new approach to memory efficiency.

A central feature of V4 is its large context window, the maximum amount of text the model can process at once. Both versions can handle 1 million tokens, enough to hold the entire Lord of the Rings trilogy plus The Hobbit. That window size is now standard across all DeepSeek services, and it matches state-of-the-art competitors such as Gemini and Claude.

What matters is not just the size of the context window but how DeepSeek achieved it. V4 makes substantial architectural changes relative to its predecessors, particularly in the attention mechanism, the component that lets an AI model work out how different parts of a prompt relate to one another. As a prompt gets longer, the number of token-to-token comparisons grows rapidly, which makes attention a primary bottleneck for models with long context windows.
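A rough sense of why attention becomes the bottleneck comes from counting comparisons. The sketch below assumes standard dense causal attention, in which every token is compared with itself and all earlier tokens, so the count grows roughly with the square of the prompt length. This is a textbook baseline for illustration, not a description of V4’s architecture, and the function name is ours.

```python
def causal_pair_count(n_tokens: int) -> int:
    """Count (query, key) pairs when each token attends to itself
    and to every earlier token, as in dense causal attention."""
    return n_tokens * (n_tokens + 1) // 2


for n in (1_000, 10_000, 100_000, 1_000_000):
    print(f"{n:>9,} tokens -> {causal_pair_count(n):>19,} comparisons")
```

Going from 1,000 to 1,000,000 tokens multiplies the prompt length by a thousand but the number of comparisons by roughly a million, which is the growth that long-context designs try to avoid.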

DeepSeek’s advance is to make attention more selective. Rather than giving all preceding text equal weight, V4 compresses older information and prioritizes whatever is most relevant to the current context. The aim is to keep crucial details from being lost while handling very long inputs efficiently.
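The general idea of "compress the old, keep what is relevant" can be illustrated with a toy example. The sketch below is not DeepSeek’s method, whose details are in its technical report. It is a deliberately simple stand-in that assumes mean-pooling as the compression step and a dot product as the relevance score, and every name and parameter in it (`compress_blocks`, `select_context`, `recent`, `block_size`, `top_k`) is hypothetical.

```python
from statistics import fmean


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def compress_blocks(vectors, block_size):
    """Mean-pool consecutive vectors into one summary vector per block."""
    summaries = []
    for start in range(0, len(vectors), block_size):
        chunk = vectors[start:start + block_size]
        summaries.append([fmean(column) for column in zip(*chunk)])
    return summaries


def select_context(keys, query, recent=4, block_size=4, top_k=2):
    """Keep the most recent keys in full; summarize older keys in blocks
    and retain only the top_k summaries most similar to the query.
    Assumes recent >= 1."""
    older, latest = keys[:-recent], keys[-recent:]
    summaries = compress_blocks(older, block_size)
    ranked = sorted(summaries, key=lambda s: dot(s, query), reverse=True)
    return ranked[:top_k] + latest


# Toy data: 16 two-dimensional "key" vectors and one query vector.
keys = [[float(i), 1.0] for i in range(16)]
query = [1.0, 0.0]

selected = select_context(keys, query)
print(len(keys), "keys reduced to", len(selected), "vectors")
```

In this toy run the model would attend over six vectors (two block summaries plus the four most recent keys) in place of the original 16. The trade-off it illustrates is the one described above: less work per token, at the risk of blurring details inside the compressed blocks.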

This approach sharply reduces the computational cost of working with long contexts. At a 1-million-token context, V4-Pro uses only 27% of the computing power and 10% of the memory required by its predecessor, V3.2. V4-Flash goes further, using just 10% of the computing power and 7% of the memory. In practice, that means much lower costs for building applications that must process large volumes of material, such as AI coding assistants that analyze entire codebases or research agents that sift through extensive document archives without running out of memory. DeepSeek’s focus on long context is not new: over the past eighteen months the company has repeatedly published research on AI memory and information retention, exploring compression and mathematical techniques to extend how much a model can process.
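The percentages above are easier to compare when converted into reduction factors. The short sketch below does only that arithmetic, using the reported fractions of V3.2’s cost at a 1-million-token context. The figures are DeepSeek’s own as cited here, and the helper name `reduction_factor` is ours.

```python
# Fraction of V3.2's cost at a 1-million-token context, as reported.
RELATIVE_COST = {
    "V4-Pro": {"compute": 0.27, "memory": 0.10},
    "V4-Flash": {"compute": 0.10, "memory": 0.07},
}


def reduction_factor(fraction: float) -> float:
    """Convert 'uses X of the original' into 'an N-fold reduction'."""
    return 1.0 / fraction


for model, costs in RELATIVE_COST.items():
    for resource, fraction in costs.items():
        factor = reduction_factor(fraction)
        print(f"{model} {resource}: {fraction:.0%} of V3.2, "
              f"roughly a {factor:.1f}x reduction")
```

Read this way, V4-Pro needs roughly 3.7 times less compute and 10 times less memory than V3.2 at full context, and V4-Flash roughly 10 times less compute and 14.3 times less memory.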

3. It marks the first steps on the hard road away from Nvidia.

V4 is DeepSeek’s first model optimized for domestic Chinese chips, including Huawei’s Ascend processors. That makes the launch a key test of whether China’s homegrown AI industry can reduce its reliance on the US chip giant Nvidia.

The move was anticipated. Earlier reports said DeepSeek had not given American chipmakers such as Nvidia and AMD early access to V4, even though chipmakers commonly receive such access so they can optimize their hardware before a model’s release. Instead, the company reportedly gave early access exclusively to Chinese chip manufacturers. Huawei has confirmed that its Ascend supernode products, based on the Ascend 950 series, will fully support DeepSeek V4. As a result, companies and individuals who want to run their own customized versions of V4 can do so on Huawei hardware.

Earlier reports indicated that Chinese government officials had encouraged DeepSeek to use Huawei chips in its training pipeline, in line with the broader national goal of technological self-reliance. That goal is especially pressing in AI, where US export controls in place since 2022 have severely restricted Chinese firms’ access to Nvidia’s most advanced chips and, more recently, even to downgraded versions. Beijing has responded by accelerating the development of a full domestic AI ecosystem spanning hardware, software, and data infrastructure.

Chinese authorities have actively promoted the adoption of domestic chips within data centers and public computing projects, implementing measures such as bans on foreign-made chips, sourcing quotas, and mandates to pair Nvidia chips with Chinese alternatives from companies like Huawei and Cambricon.

Replacing Nvidia, however, is far harder than swapping one chip for another. Nvidia’s dominance rests not only on its superior chips but also on a mature software ecosystem that developers have built up over many years. Moving to Huawei’s Ascend chips means adapting model code, rebuilding essential tools, and proving that systems built on the new hardware are stable and reliable.

DeepSeek has not moved away from Nvidia entirely. The company’s technical report indicates that it uses Chinese chips for inference, the process of generating outputs in response to user queries, but analysis suggests that only part of V4’s training process has been adapted for Chinese silicon. The report is ambiguous about whether key long-context features have been adapted for domestic chips, which leaves open the possibility that V4 was trained primarily on Nvidia hardware. Multiple sources, who requested anonymity because of the political sensitivity of the matter, told MIT Technology Review that while Chinese chips still lag behind Nvidia’s in overall performance, they are increasingly viable for inference.

DeepSeek is tying V4’s future pricing to this hardware transition. The company expects to cut V4-Pro prices significantly once Huawei’s Ascend 950 supernodes are deployed at scale in the second half of this year. If that works, V4 could be an early sign of China’s progress toward a parallel, independent AI infrastructure.