AI's inner workings baffle experts at major summit

At the world's largest AI conference, tech giants admit their most advanced models remain mysterious black boxes even to their creators.

Intelligence Desk · 3 min read

AI Snapshot

The TL;DR: what matters, fast.

NeurIPS 2025 drew a record 26,000 attendees as AI transformed from academic niche to industrial powerhouse

Google, OpenAI admit limited understanding of their most advanced AI models' internal mechanisms

Martian offers £790,000 prize to accelerate AI interpretability research breakthroughs

AI's Black Box Problem Takes Centre Stage at World's Biggest AI Conference

The annual Neural Information Processing Systems (NeurIPS) conference drew a record 26,000 attendees to San Diego, doubling attendance from just six years ago. This explosive growth mirrors AI's transformation from academic niche to global industrial powerhouse. Yet despite the proliferation of highly specialised topics, one fundamental question dominated discussions: how do frontier AI systems actually work?

Google, OpenAI, and other tech giants found themselves admitting an uncomfortable truth. Their most advanced AI models remain largely opaque, even to their creators. This pursuit of understanding AI's internal mechanisms, known as interpretability, has become the field's most pressing challenge.

The Great AI Mystery Deepens

A surprising consensus emerged among leading AI researchers and CEOs: they have limited understanding of how today's most advanced AI models function internally. Shriyash Upadhyay, co-founder of Martian, an interpretability-focused company, compared the field to early physics when scientists still questioned whether particles like electrons existed and could be measured.

"We're at the stage where we're asking what it truly means to have an interpretable system," said Shriyash Upadhyay, AI researcher and co-founder of Martian. "It's like the early days of physics when fundamental questions about particles were still being posed."

Martian has launched a £790,000 prize to accelerate progress in this area. The paradox is stark: whilst core mechanisms of large language models remain opaque, demand for them soars. Companies like OpenAI experience unprecedented growth, with rival systems like Gemini rapidly gaining users.

The implications extend far beyond academic curiosity. As AI safety experts depart major companies, questions about AI interpretability become increasingly urgent for both developers and society at large.

By The Numbers

  • 26,000 attendees at NeurIPS 2025, double the number from six years ago
  • £790,000 prize offered by Martian for interpretability breakthroughs
  • Founded in 1987, NeurIPS has grown from academic conference to mainstream AI summit
  • Fourth consecutive year for the NeurIPS AI-for-science offshoot event
  • Multiple major AI firms now have dedicated interpretability teams

Tech Giants Split on Strategy

The conference revealed diverging approaches among major AI firms. Google's team announced a significant pivot away from ambitious "near-complete reverse-engineering" goals. Neel Nanda, a Google interpretability leader, acknowledged these comprehensive approaches are currently out of reach.

Instead, Google is focusing on practical, impact-driven methods with tangible results expected within a decade. This shift recognises AI's rapid development pace and the limited success of earlier reverse-engineering attempts.

"We're moving away from near-complete reverse-engineering goals because they're currently out of reach," explained Neel Nanda, Google interpretability leader. "We're focusing on practical methods that can deliver results within a decade."

OpenAI takes the opposite approach. Leo Gao, OpenAI's head of interpretability, declared commitment to deeper, more ambitious interpretability goals. The company aims for full understanding of neural network operations, tackling complexity head-on despite uncertain short-term success.

Company | Approach                  | Timeline        | Focus
Google  | Practical, impact-driven  | Within a decade | Tangible results
OpenAI  | Deep, comprehensive       | Long-term       | Full understanding
FAR.AI  | Behavioural analysis      | Ongoing         | Meaningful progress

Some experts remain sceptical about complete interpretability. Adam Gleave from FAR.AI believes deep learning models may be inherently too complex for simple human comprehension. However, he remains optimistic about understanding model behaviour at various levels.

Measurement Tools Lag Behind AI Capabilities

Beyond understanding how AI works, researchers struggle with inadequate evaluation methods. Current measurement tools fail to assess complex concepts like intelligence and reasoning in modern AI systems.

Sanmi Koyejo from Stanford University's Trustworthy AI Research Lab highlighted this gap. Many existing benchmarks were designed for earlier AI models, focusing on specific, narrower tasks. Today's advanced AI capabilities demand new, reliable tests for accurate assessment.

The challenge is particularly acute in specialised fields. Ziv Bar-Joseph from Carnegie Mellon University and founder of GenBio AI described biological AI evaluation as being in "extremely, extremely early stages."

This measurement problem affects AI adoption across industries, where organisations struggle to evaluate AI performance for specific use cases. The lack of robust evaluation metrics creates uncertainty for businesses investing in AI solutions.

  • Existing benchmarks focus on narrow tasks unsuitable for general AI assessment
  • Specialised fields like biology lack proper evaluation frameworks
  • New testing methods needed for advanced reasoning and intelligence measures
  • Current tools inadequate for assessing real-world AI applications
  • Gap between AI capabilities and measurement sophistication continues widening

Science Accelerates Despite AI Opacity

Despite interpretability challenges, AI systems prove powerful tools for scientific research. Upadhyay noted that "people built bridges before Isaac Newton figured out physics," highlighting how practical application often precedes theoretical comprehension.

For the fourth consecutive year, a NeurIPS offshoot focused on AI's role in scientific discovery. Ada Fang, a Harvard PhD student researching AI in chemistry, called this year's event a "great success." She emphasised shared challenges and ideas across diverse scientific domains applying AI.

Jeff Clune from the University of British Columbia observed dramatic shifts in enthusiasm for AI-driven scientific discovery. Interest in creating AI that can learn, discover, and innovate for science has gone "through the roof," contrasting sharply with a decade ago when the field was largely overlooked.

The momentum suggests AI is positioned to tackle humanity's most pressing scientific problems. This aligns with broader trends in Asia's sovereign AI investments, where governments recognise AI's potential for national competitiveness.

Frequently Asked Questions

What is AI interpretability and why does it matter?

AI interpretability refers to understanding how AI systems make decisions and process information internally. It's crucial for building trustworthy AI systems, identifying potential biases, and ensuring AI behaves safely in critical applications.
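To make the idea concrete, below is a minimal illustrative sketch of one widely used interpretability technique: a linear "probe" trained to test whether a concept can be read out of a model's internal activations. It is not drawn from any system or method discussed at NeurIPS; the activations and concept labels are synthetic stand-ins generated with numpy, and the probe is an off-the-shelf scikit-learn classifier.

```python
# Minimal linear-probe sketch: can a simple classifier recover a "concept"
# from a model's hidden activations? Synthetic data stands in for a real model.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Pretend these are hidden-layer activations for 1,000 inputs (64 dimensions),
# where one direction weakly encodes a binary concept (e.g. "is a question").
n, d = 1000, 64
concept = rng.integers(0, 2, size=n)            # hidden "ground truth" labels
concept_direction = rng.normal(size=d)
activations = rng.normal(size=(n, d)) + 0.8 * np.outer(concept, concept_direction)

# Train a linear probe on half the activations, evaluate on the held-out half.
X_train, X_test, y_train, y_test = train_test_split(
    activations, concept, test_size=0.5, random_state=0
)
probe = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# High held-out accuracy suggests the concept is linearly decodable from the
# activations; it does not by itself show the model actually uses it.
print(f"probe accuracy: {probe.score(X_test, y_test):.2f}")
```

Even in this toy setting, a high probe score only shows that a concept is decodable from the activations; establishing that a model actually relies on it requires further evidence, which is part of why interpretability remains so difficult.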

Why don't AI developers understand their own systems?

Modern AI systems like large language models are incredibly complex, with billions of parameters interacting in ways that aren't easily traceable. They're trained on vast datasets through processes that create emergent behaviours difficult to predict or explain.

How are different companies approaching AI interpretability?

Google focuses on practical, short-term solutions with measurable impact, whilst OpenAI pursues deeper, more comprehensive understanding of neural networks. Other companies take behavioural analysis approaches, studying what AI does rather than how it works internally.

What are the main challenges in measuring AI capabilities?

Current benchmarks were designed for simpler AI systems and don't adequately test modern capabilities like reasoning, creativity, or general intelligence. New evaluation methods are needed, especially for specialised applications in fields like biology or medicine.

Can AI be useful for science without full interpretability?

Yes, AI is already accelerating scientific discovery across multiple fields. Historical precedent shows practical applications often precede complete theoretical understanding, similar to how bridges were built before physics was fully developed.

The AIinASIA View: The interpretability crisis reveals a fundamental tension in AI development. We're deploying increasingly powerful systems without understanding their inner workings, creating both unprecedented opportunities and risks. The divergent strategies between tech giants reflect genuine uncertainty about the best path forward. However, we believe this challenge will drive innovation in AI safety and evaluation methods. The key is maintaining momentum in practical applications whilst building interpretability capabilities. Asia's growing AI investments must prioritise interpretability research alongside capability development to ensure long-term competitiveness and safety.

The rapid evolution of AI necessitates continuous re-evaluation of how we understand and assess these powerful tools. Whilst interpretability and robust measurement remain significant hurdles, AI's potential to drive innovation, particularly in scientific research, is undeniable. As the field grapples with these fundamental questions, the stakes continue rising for both developers and society.

What's your view on the trade-off between AI capability and interpretability? Should we slow development until we better understand these systems, or continue advancing whilst building interpretability in parallel? Drop your take in the comments below.

This is a developing story

We're tracking this across Asia-Pacific and may update with new developments, follow-ups and regional context.

Latest Comments (5)

Ploy Siriwan (@ploytech) · 16 January 2026

yeah, this opaque AI thing is def a key challenge for adoption in SEA too. especially for regulated industries like finance. 🇹🇭

Dewi Sari (@dewisari) · 8 January 2026

the Martian prize money for interpretability feels a bit off. like, if top experts at NeurIPS are openly admitting they don't get how LLMs work, is a million dollars really going to crack it open? i've tried digging into some smaller open-source models myself and it's just such a black box.

Zhang Yue (@zhangy) · 4 January 2026

i read about martian's interpretability prize. 1 million USD is good, but for real progress, perhaps we need more fundamental work, not just incentives. like the qwen team, they focus on architecture from the start for better control, not just post-hoc explanation. this "black box" issue is complex.

Charlotte Davies (@charlotted) · 26 December 2025

It's encouraging to see the interpretability conundrum taking centre stage at NeurIPS. This opacity in frontier AI systems is precisely why bodies like the UK AI Safety Institute are prioritising research into model evaluations and transparency. Without a robust understanding of internal mechanics, effective governance and ethical deployment remain significant challenges.

Ryota Ito (@ryota) · 19 December 2025

I totally get that struggle with understanding LLMs. For us working with Japanese models, especially fine-tuning, the 'how' behind certain outputs feels like a black box sometimes. It's a huge hurdle.
