
AI in ASIA

MiniMax M2.5 Undercuts Western AI Labs on Price

A Shanghai lab just matched frontier AI performance at a twentieth of the price. The token economy will never look the same.

Intelligence Desk · 6 min read

Shanghai's Zhangjiang tech corridor, home to MiniMax and China's AI ambitions

AI Snapshot

The TL;DR: what matters, fast.

MiniMax M2.5 matches Claude and GPT on coding benchmarks at 10-20x lower cost

Five Chinese AI models launched within weeks of each other as the gap with Western labs narrows from months to weeks

Asian developers now face a fundamental question about where to spend their AI budgets

Shanghai's Quiet Challenger Just Rewrote the AI Price List

A month ago, MiniMax was a name most people outside China's tech circles had never heard. That changed on 12 February when the Shanghai-based lab released M2.5, an open-weight model that matches or beats the best from Anthropic, Google, and OpenAI on several key benchmarks, at a fraction of the cost.

The timing was deliberate. MiniMax had just completed its Hong Kong IPO, and M2.5 was its opening argument to the global developer market: frontier-class intelligence, priced for mass adoption.

The Numbers That Spooked Silicon Valley

M2.5 scored 80.2% on SWE-Bench Verified, the industry's go-to test for real-world coding ability. That puts it neck-and-neck with Anthropic's Claude Opus 4.6 at 80.8% and ahead of OpenAI's GPT-5.2 at 80.0%. On BrowseComp, which measures web search and retrieval, M2.5 hit 76.3%, outpacing GPT-5.2's 65.8% by a wide margin.

But the real story is the price tag. MiniMax charges $0.30 per million input tokens and $1.20 per million output tokens. That is 10 to 20 times cheaper than comparable offerings from Western labs. For developers building agentic applications that chew through millions of tokens daily, this is not a marginal saving. It is a structural shift.
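At those rates, the arithmetic is easy to check. A minimal sketch of the monthly bill for an agentic workload (the per-token prices come from the article; the workload volumes are hypothetical):

```python
# MiniMax M2.5's published rates, in USD per million tokens (from the article).
INPUT_PRICE = 0.30
OUTPUT_PRICE = 1.20

def monthly_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the monthly API bill in USD for a given token volume."""
    return (input_tokens / 1e6) * INPUT_PRICE + (output_tokens / 1e6) * OUTPUT_PRICE

# Illustrative example: an agent pipeline consuming 500M input and
# 100M output tokens per month.
cost = monthly_cost(500_000_000, 100_000_000)
print(f"${cost:,.2f}")  # $270.00
```

At 10 to 20 times these rates, the same workload on a Western frontier API would land in the thousands of dollars per month, which is the structural shift the article describes.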

"AI infrastructure is becoming foundational economic infrastructure. The companies that control cost-per-token will control the next wave of deployment." - Jensen Huang, CEO, NVIDIA

How MiniMax Built a Frontier Model on a Budget

M2.5 is a Mixture of Experts architecture with 230 billion total parameters, but only 10 billion are active during any given inference call. This design means the model carries the knowledge of a much larger system while running with the efficiency of a far smaller one.
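The efficiency claim is straightforward to illustrate. Below is a toy sparse Mixture of Experts layer in NumPy: a router scores all experts per token but only the top-k actually run, so compute scales with active rather than total parameters. The 230B/10B split is from the article; the tiny dimensions, top-2 routing, and single-token forward pass here are illustrative assumptions, not MiniMax's actual design.

```python
import numpy as np

rng = np.random.default_rng(0)

NUM_EXPERTS, TOP_K, D = 8, 2, 16  # toy sizes; production MoE models are far larger

# Each expert is a small feed-forward weight matrix; a router scores experts per token.
experts = [rng.standard_normal((D, D)) for _ in range(NUM_EXPERTS)]
router_w = rng.standard_normal((D, NUM_EXPERTS))

def moe_forward(x: np.ndarray) -> np.ndarray:
    """Route one token vector through only the top-k experts."""
    logits = x @ router_w
    top = np.argsort(logits)[-TOP_K:]   # indices of the k highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()            # softmax over the selected experts only
    # Only TOP_K of NUM_EXPERTS experts execute, so per-token compute tracks
    # active parameters, not total parameters.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

token = rng.standard_normal(D)
out = moe_forward(token)
print(out.shape)  # (16,)
```

Scaled up, the same principle lets a 230B-parameter model run inference with the compute budget of a roughly 10B-parameter dense model.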

The result is a model that is 37% faster on complex tasks than its predecessor M2.1 and uses roughly 20% fewer search and tool iterations on agentic benchmarks. In practical terms, it does more work with fewer calls, which compounds the cost advantage.

By The Numbers

  • 80.2%: M2.5's score on SWE-Bench Verified, matching frontier Western models
  • $0.30 per million input tokens: 10-20x cheaper than comparable Western offerings
  • 230 billion total parameters: only 10 billion active during inference
  • 76.3% on BrowseComp: Outperforming GPT-5.2 by more than 10 percentage points
  • 42 on Artificial Analysis Intelligence Index: Well above the median of 27 for similar open-weight models

Five Models in Five Weeks

MiniMax is not alone. In the first weeks of March 2026, five major Chinese AI models hit the market from Tencent, Alibaba, Baidu, and ByteDance. The pace is relentless. The lag between Chinese releases and the Western frontier has shrunk from months to weeks, and in some benchmarks, the gap has closed entirely.

Beijing approved Alibaba, ByteDance, and Tencent to order roughly 400,000 NVIDIA H200 chips in January, while simultaneously pushing domestic alternatives from Huawei and Cambricon. China is running a dual-track strategy: buy the best available hardware now, build your own for later.

"The cost per token is still too high for most enterprise deployments in Asia. Models like M2.5 change the equation completely for regional developers." - Kai-Fu Lee, CEO, Sinovation Ventures

The open-source angle matters too. Chinese labs have embraced open weights as a distribution strategy, earning goodwill in global developer communities. More Silicon Valley applications are expected to ship on top of Chinese open models in 2026 than ever before.

Inside a Shanghai AI research campus where MiniMax engineers are pushing the boundaries of cost-efficient model design

What This Means for Asian Developers

For startups across Southeast Asia, India, and Japan, MiniMax's pricing removes one of the biggest barriers to building AI-native products. A company in Jakarta or Bangalore that previously budgeted $50,000 a month for API calls can now get equivalent intelligence for $2,500. That changes what is economically viable.

Model            SWE-Bench Verified   Input Cost (per 1M tokens)   Architecture
MiniMax M2.5     80.2%                $0.30                        MoE, 230B total / 10B active
Claude Opus 4.6  80.8%                $15.00                       Dense
GPT-5.2          80.0%                $10.00                       Dense
Gemini 3 Pro     78.5%                $7.00                        MoE

The table tells the story. When a $0.30 model performs within a percentage point of a $15.00 model, the conversation shifts from capability to deployment economics. And in markets where margins are thin and developer salaries are lower, deployment economics is everything.

The Catch Nobody Talks About

There are caveats. M2.5's output speed of 39.3 tokens per second sits below the median of 52.6 for comparable models. For latency-sensitive applications like real-time chat, that matters. The model also lacks the extensive safety tuning and alignment infrastructure that Western labs have built over years.

Geopolitics remains the elephant in the room. Enterprise customers in regulated industries may hesitate to build critical systems on Chinese-origin models, regardless of performance. Data sovereignty concerns, export controls, and shifting regulatory landscapes all add friction that raw benchmarks cannot capture.

But for the vast middle market of developers building internal tools, content systems, and lightweight agentic workflows, those concerns are secondary to the question that MiniMax has forced: why pay 20 times more for the same result?

Is MiniMax M2.5 really as good as Claude or GPT?

On coding benchmarks like SWE-Bench Verified, M2.5 performs within one percentage point of both Claude Opus 4.6 and GPT-5.2. However, benchmark scores do not capture everything. Safety alignment, output consistency, and latency vary, and real-world performance depends heavily on the specific use case.

Why is MiniMax so much cheaper than Western models?

The Mixture of Experts architecture uses only 10 billion of its 230 billion parameters during inference, dramatically reducing compute costs. Lower labour costs, aggressive pricing strategy to capture market share, and open-weight distribution also play a role.

Can businesses outside China safely use Chinese AI models?

For non-sensitive applications like content generation, internal tooling, and development assistance, many businesses already do. For regulated industries handling personal data or operating under strict compliance requirements, additional due diligence on data handling and model provenance is advisable.

What does this mean for AI pricing in 2026?

Downward pressure is now structural. Western labs face a choice: match Chinese pricing by improving efficiency, differentiate on safety and alignment, or focus on enterprise features that justify premium pricing. Most will attempt all three.

The AIinASIA View: MiniMax M2.5 is not just another benchmark-topper. It is a pricing signal that will ripple through every AI budget in Asia for the rest of 2026. We have been arguing for months that the real AI race is not about who builds the smartest model but who makes intelligence cheap enough to deploy at population scale. MiniMax just proved that a Shanghai lab with a fraction of the resources can deliver frontier performance at a twentieth of the price. Western labs will respond, probably by slashing prices within weeks. But the damage to the premium pricing model is done. For Asian developers, the golden era of affordable AI has arrived.

If you could build any AI product with tokens at $0.30 per million, what would you build first? Drop your take in the comments below.


This is a developing story

We're tracking this across Asia-Pacific and may update with new developments, follow-ups and regional context.
