AI in Asia · News

DeepSeek V4 Just Made Asia's Open-Source AI Race Look Like A Two-Horse Affair

DeepSeek's V4 preview drops a million-token context, $0.28 Flash pricing, and Huawei Ascend training onto Asia in one weekend.

Updated Apr 26, 2026 · 7 min read

DeepSeek released a preview of its long-awaited V4 model on Friday, April 24, 2026, and the launch sets a new bar for Asia's open-source AI scene. The Hangzhou-based startup shipped two Mixture-of-Experts variants and pegged Pro-tier output at roughly $3.48 per million tokens, a price point that pulls frontier-grade reasoning into reach for any Asian developer with a credit card.

What V4 Actually Ships

V4 arrives in two sizes. DeepSeek-V4-Pro is a 1.6-trillion-parameter MoE with 49 billion parameters activated per forward pass, while DeepSeek-V4-Flash runs 284 billion total parameters with 13 billion active. Both support a one-million-token context window, which puts the family in roughly the same architectural league as the most recent Claude and Gemini frontier releases.
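The economics of those MoE configurations come down to the active-to-total parameter ratio: how small a slice of the network each token actually touches. A quick back-of-the-envelope check, using only the figures quoted above (the helper function is illustrative, not part of any DeepSeek tooling):

```python
def moe_active_ratio(total_b: float, active_b: float) -> float:
    """Fraction of parameters activated per forward pass (inputs in billions)."""
    return active_b / total_b

# DeepSeek-V4-Pro: 1.6 trillion total, 49 billion active per token
pro = moe_active_ratio(1600, 49)    # ~0.031, i.e. ~3% of weights per token
# DeepSeek-V4-Flash: 284 billion total, 13 billion active per token
flash = moe_active_ratio(284, 13)   # ~0.046, i.e. ~4.6% of weights per token

print(f"Pro activates {pro:.1%} of its parameters per token")
print(f"Flash activates {flash:.1%} of its parameters per token")
```

The takeaway: despite a 5x gap in total size, both models do per-token compute on the order of a few percent of their weights, which is what makes the pricing below possible.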

The company says V4-Pro hits strong scores on reasoning, knowledge, and agentic benchmarks, and the model card on Hugging Face confirms the open-weights release. DeepSeek also claims the model can run autonomously on tasks like writing and debugging multi-file code, an explicit signal that the company is now optimising for agent workflows rather than chat.

V4 is almost on the frontier, at a fraction of the price.

Simon Willison, AI researcher and creator of Datasette

The pricing line is the headline number. V4-Flash output costs $0.28 per million tokens, a ten-times-plus discount on comparable Western frontier output, and V4-Pro's $3.48 is less than half of what Alibaba and Tencent currently quote on their flagship Qwen and Hunyuan endpoints.
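To make that discount concrete, here is a rough monthly-bill comparison at the rates quoted in this piece. The workload size is a made-up example, and the prices cover output tokens only (input-token rates would shift the totals):

```python
# Output-token prices in USD per million tokens, from the figures in this article
PRICES = {
    "DeepSeek V4-Pro": 3.48,
    "DeepSeek V4-Flash": 0.28,
    "Alibaba Qwen 3.5 (approx.)": 8.00,
    "Western frontier (low end)": 15.00,
}

def monthly_cost(output_tokens_m: float, price_per_m: float) -> float:
    """USD cost for a month's output, with token volume given in millions."""
    return output_tokens_m * price_per_m

# Hypothetical workload: 500 million output tokens per month
for name, price in PRICES.items():
    print(f"{name}: ${monthly_cost(500, price):,.2f}/month")
```

At that volume, the same workload runs $140 a month on V4-Flash against $7,500 at the low end of Western frontier pricing, which is the gap driving the procurement story below.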

The Huawei Angle Is The Real Story For Asia

The chip story is just as significant as the model. DeepSeek confirmed V4 was trained and is being served on Huawei Ascend 950 silicon clusters knit together by Huawei's Supernode interconnect, with Cambricon providing supporting accelerators. That is the first time a globally competitive frontier-class model has been trained and served end-to-end on Chinese-designed chips.

For Asian cloud buyers, that matters in three ways. First, it removes the Nvidia bottleneck for one specific high-quality option. Second, it gives Huawei a reference customer story it can take to every state-owned enterprise in mainland China and every Belt and Road buyer in Southeast Asia. Third, it sharpens a real divergence between the Nvidia-anchored stack used by most Singapore, Japanese, and Korean operators and the Huawei-anchored stack now usable in the People's Republic.

By The Numbers

  • $3.48 per million output tokens for V4-Pro, and just $0.28 for V4-Flash, according to DeepSeek's published API rate card
  • 1.6 trillion total parameters in V4-Pro, with 49 billion activated per token via Mixture-of-Experts routing
  • 1,000,000-token context window across both V4 variants, matching the longest commercial context windows on the market
  • 80% combined HBM market share held by Samsung and SK Hynix, the supply that DeepSeek explicitly bypassed for V4 training
  • $54.6 billion estimated 2026 HBM market size from Bank of America, up 58% year-on-year, signalling how much demand DeepSeek is rerouting
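The HBM figure implies a 2025 base that is easy to sanity-check: a $54.6 billion 2026 estimate at 58% year-on-year growth backs out to roughly $34.6 billion for 2025 (a derived figure, not one quoted in the Bank of America report):

```python
def implied_prior_year(current: float, yoy_growth: float) -> float:
    """Back out last year's value from this year's value and a YoY growth rate."""
    return current / (1 + yoy_growth)

# 2026 HBM market estimate of $54.6B, up 58% YoY
hbm_2025 = implied_prior_year(54.6, 0.58)
print(f"Implied 2025 HBM market: ${hbm_2025:.1f}B")
```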

Why This Lands Differently In Asia

DeepSeek's January 2025 release rattled US markets and triggered a brief Nvidia sell-off. V4 is unlikely to do the same to share prices, because the market has already priced in the existence of capable Chinese open models. The shift this time is geographic. The story for Asia is that an open-weights, frontier-tier model with a one-million-token context now exists, costs a fraction of Western alternatives, and runs natively on Asian-made silicon.

That changes procurement maths in Jakarta, Hanoi, and Riyadh in ways it does not change them in San Francisco. A regional bank evaluating a domestic LLM vendor in Vietnam, a government ministry building a translation pipeline in Indonesia, or a university in Thailand wiring up a research assistant can all now consider an option whose total stack (weights, infrastructure, and pricing) has no US dependency. That is a structural shift, not a news cycle.

The catch is governance. V4 weights ship under DeepSeek's open licence, and security teams across the region are already raising questions about provenance, training data, and the absence of independent third-party safety evaluations. The early consensus from regional security analysts is that V4 will be deployed in air-gapped or self-hosted form by enterprises that want the cost savings but cannot route sensitive data through DeepSeek's hosted API.

Asia now has a credible open-source frontier option that does not require Nvidia or US infrastructure. That is the durable change here, not the benchmark numbers.

Bill Bishop, founder, Sinocism newsletter

Who Wins And Who Has To Move

Alibaba Qwen and Tencent Hunyuan now have a domestic competitor that has shown faster iteration tempo and tighter chip integration. Expect a Qwen 4 push in the coming weeks, almost certainly priced aggressively to defend market share. Singapore's Sea Group, which just announced its own AI Centre of Excellence, will quietly add V4-Flash to its evaluation list because the unit economics are simply better than serving its own homegrown model at scale. Indian enterprises that have been running Sarvam-30B for cost reasons will now have a frontier option for the workloads where Sarvam is too small.

The losers are the second-tier proprietary providers across the region who have been charging premium rates for moderate-capability models. The price floor just dropped, and it dropped hard.

Provider | Output Price (per 1M tokens) | Open Weights | Native Asia Chip Support
DeepSeek V4-Pro | $3.48 | Yes | Huawei Ascend 950
DeepSeek V4-Flash | $0.28 | Yes | Huawei Ascend 950
Alibaba Qwen 3.5 | ~$8.00 | Partial | Mixed Nvidia/Hanguang
OpenAI GPT-class | $15-60 | No | Nvidia only
Anthropic Claude | $15-75 | No | Nvidia only

This is the table every Asia-Pacific CIO will print out next week.

For deeper background, see our coverage of Alibaba and Tencent's reported DeepSeek investment, the TSMC profit-jump signal, and our explainer on deploying multi-agent systems in Asia.

The AIinASIA View: V4 is the moment open-source AI in Asia stopped being a curiosity and became a commercial baseline. The combination of frontier-grade reasoning, a million-token context, an open-weights licence, and Huawei silicon underneath gives the region a complete sovereign stack option. For governments and enterprises that have spent two years worried about US export controls, that is the most consequential thing that has happened this quarter. We expect aggressive Qwen and Hunyuan responses inside thirty days, and we expect at least one major ASEAN sovereign model project to quietly pivot to V4-Flash as a backbone before the end of Q2.

Frequently Asked Questions

How does DeepSeek V4 compare to Western frontier models?

DeepSeek says V4-Pro is "almost frontier" on reasoning and agent benchmarks, which independent testers including Simon Willison have broadly confirmed in early evaluations. The big differentiator is price, with V4-Pro running roughly four to twenty times cheaper than comparable proprietary models from OpenAI or Anthropic.

Can enterprises run V4 entirely on their own infrastructure?

Yes. The weights are open and published on Hugging Face, and DeepSeek has documented inference recipes for both Huawei Ascend and Nvidia GPU clusters. That self-hosting option is exactly why V4 will land in regulated industries that cannot route data through hosted APIs.
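For teams weighing the self-hosted route, the integration surface is the familiar chat-completions shape. As a hedged sketch: DeepSeek's hosted API has historically been OpenAI-compatible, and common self-hosting servers expose the same request format, but the endpoint URL and model identifier below are placeholders, not confirmed V4 values:

```python
import json

# Placeholder values: a self-hosted inference server on the local network
# and an assumed model id. Neither is a confirmed DeepSeek identifier.
ENDPOINT = "http://localhost:8000/v1/chat/completions"
MODEL = "deepseek-v4-flash"

def build_request(prompt: str, max_tokens: int = 512) -> str:
    """Build an OpenAI-compatible chat-completions request body as JSON."""
    payload = {
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }
    return json.dumps(payload)

body = build_request("Summarise this contract clause in plain English.")
print(f"POST {ENDPOINT}\n{body}")
```

Because the request shape matches the de facto standard, swapping a hosted endpoint for an air-gapped one is a one-line configuration change rather than a rewrite, which is much of the appeal for regulated industries.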

What are the regulatory risks for using V4 in Asia?

The main concerns are data provenance and the lack of independent third-party safety evaluations. Most analysts expect Asian enterprises outside mainland China to deploy V4 in air-gapped or self-hosted setups while regulators in Singapore, Japan, and India develop their own conformance frameworks.

Does V4 run on Nvidia GPUs as well as Huawei Ascend chips?

Yes. While DeepSeek trained and serves V4 on Huawei Supernode clusters, the open weights are usable on any sufficiently capable accelerator stack, including Nvidia H100/H200 and Blackwell systems. Most Asian cloud providers will run a hybrid setup.

How quickly will Alibaba and Tencent respond?

Both companies have flagship models that are now visibly behind on pricing and context length. Industry watchers expect updated Qwen and Hunyuan releases inside thirty days, almost certainly with aggressive price cuts and longer context windows in response to V4.