News

The Skeleton Key AI Jailbreak Technique Unveiled

The Skeleton Key AI jailbreak technique poses a threat to AI security.

Published

11 months ago

July 1, 2024

AIinAsia

TL;DR:

Microsoft uncovers a new AI jailbreak technique called Skeleton Key, capable of bypassing safety guardrails in multiple AI models.
Prominent AI models, including GPT-3.5 Turbo and GPT-4, are vulnerable to this technique.
Microsoft proposes a multi-layered approach to counter the threat, including input filtering, prompt engineering, and output filtering.

The AI Threat You Need to Know: The Skeleton Key Jailbreak Technique

Artificial intelligence (AI) is transforming industries and revolutionising the way we live. However, recent findings by Microsoft researchers have uncovered a new threat: the Skeleton Key AI jailbreak technique. This technique can bypass safety guardrails in multiple generative AI models, potentially allowing attackers to extract harmful or restricted information.

What is the Skeleton Key Technique?

The Skeleton Key technique manipulates AI models into ignoring their built-in safety protocols using a multi-turn strategy. It works by instructing the model to augment its behaviour guidelines rather than changing them outright. This approach, known as “Explicit: forced instruction-following,” effectively narrows the gap between what the model is capable of doing and what it is willing to do. Once successful, the attacker gains complete control over the AI’s output.

Affected AI Models

Testing conducted by Microsoft revealed that several prominent AI models were vulnerable to the Skeleton Key jailbreak technique. These models include Meta’s Llama3-70b-instruct, Google’s Gemini Pro, OpenAI’s GPT-3.5 Turbo and GPT-4, Mistral Large, Anthropic’s Claude 3 Opus, and Cohere’s Commander R Plus. When subjected to the Skeleton Key attack, these models complied fully with requests across various risk categories.

Mitigation Strategies

To counter the Skeleton Key jailbreak threat, Microsoft recommends a multi-layered approach for AI system designers. This includes implementing input filtering to detect and block potentially harmful inputs, careful prompt engineering of system messages to reinforce appropriate behaviour, and output filtering to prevent the generation of content that breaches safety criteria. Additionally, abuse monitoring systems trained on adversarial examples should be employed to detect and mitigate recurring problematic content or behaviours.

Significance and Challenges

The discovery of the Skeleton Key jailbreak technique underscores the ongoing challenges in securing AI systems as they become more prevalent. This vulnerability highlights the critical need for robust security measures across all layers of the AI stack. While the impact is limited to manipulating the model’s outputs rather than accessing user data or taking control of the system, the technique’s ability to bypass multiple AI models’ safeguards raises concerns about the effectiveness of current responsible AI guidelines.

Protect Your AI

To protect your AI from potential jailbreaks, consider implementing Microsoft’s recommended multi-layered approach. This includes input filtering, prompt engineering, output filtering, and abuse monitoring systems.

What steps are you taking to protect your AI systems from emerging threats like the Skeleton Key jailbreak technique? Share your thoughts below and don’t forget to subscribe for updates on AI and AGI developments.

You may also like:

The Risks and Rewards of Using AI in Wargame Simulations
AI Risk Management: Navigating the Opportunities and Challenges in Asia
The Emergence of AI Worms: A New Cybersecurity Threat in Asia
To learn more about the Skeleton Key tap here.

Author

AIinAsia

View all posts

Discover more from AIinASIA

Subscribe to get the latest posts sent to your email.

Related Topics:abuse monitoring AI models AI security Featured input filtering jailbreak technique output filtering safety guardrails

Up Next

Apple’s Revolutionary Intelligence Suite Comes to iPhones and Macs

Don't Miss

Revolutionising AI Image Generation: Ambient Diffusion and the Copyright Conundrum

Click to comment

News

If AI Kills the Open Web, What’s Next?

Exploring how AI is transforming the open web, the rise of agentic AI, and emerging monetisation models like microtransactions and stablecoins.

Published

1 week ago

May 28, 2025

AIinAsia

The web is shifting from human-readable pages to machine-mediated experiences with AI impacting the future of the open web. What comes next may be less open—but potentially more useful.

TL;DR — What You Need To Know

AI is reshaping web navigation: Google’s AI Overviews and similar tools provide direct answers, reducing the need to visit individual websites.
Agentic AI is on the rise: Autonomous AI agents are beginning to perform tasks like browsing, shopping, and content creation on behalf of users.
Monetisation models are evolving: Traditional ad-based revenue is declining, with microtransactions and stablecoins emerging as alternative monetisation methods.
The open web faces challenges: The shift towards AI-driven interactions threatens the traditional open web model, raising concerns about content diversity and accessibility.

The Rise of Agentic AI

The traditional web, characterised by human users navigating through hyperlinks and search results, is undergoing a transformation. AI-driven tools like Google’s AI Overviews now provide synthesised answers directly on the search page, reducing the need for users to click through to individual websites.

This shift is further amplified by the emergence of agentic AI—autonomous agents capable of performing tasks such as browsing, shopping, and content creation without direct human intervention. For instance, Opera’s new AI browser, Opera Neon, can automate internet tasks using contextual awareness and AI agents.

These developments suggest a future where AI agents act as intermediaries between users and the web, fundamentally altering how information is accessed and consumed.

Monetisation in the AI Era

The traditional ad-based revenue model that supported much of the open web is under threat. As AI tools provide direct answers, traffic to individual websites declines, impacting advertising revenues.

In response, new monetisation strategies are emerging. Microtransactions facilitated by stablecoins offer a way for users to pay small amounts for content or services, enabling creators to earn revenue directly from consumers. Platforms like AiTube are integrating blockchain-based payments, allowing creators to receive earnings through stablecoins across multiple protocols.

This model not only provides a potential revenue stream for content creators but also aligns with the agentic web’s emphasis on seamless, automated interactions.

The Future of the Open Web

The open web, once a bastion of free and diverse information, is facing significant challenges. The rise of AI-driven tools and platforms threatens to centralise information access, potentially reducing the diversity of content and perspectives available to users.

However, efforts are underway to preserve the open web’s principles. Initiatives like Microsoft’s NLWeb aim to create open standards that allow AI agents to access and interact with web content in a way that maintains openness and interoperability.

The future of the web may depend on balancing the efficiency and convenience of AI-driven tools with the need to maintain a diverse and accessible information ecosystem.

What Do YOU Think?

As AI impacts the future of the open web, we must consider how to preserve the values of openness, diversity, and accessibility. How can we ensure that the web remains a space for all voices, even as AI agents become the primary means of navigation and interaction?

You may also like:

Top 10 AI Trends Transforming Asia by 2025
Build Your Own Agentic AI — No Coding Required
Is AI Really Paying Off? CFOs Say ‘Not Yet’
Or tap here to explore the free version of Claude AI.

Author

AIinAsia

View all posts

Discover more from AIinASIA

Subscribe to get the latest posts sent to your email.

News

GPT-5 Is Less About Revolution, More About Refinement

This article explores OpenAI’s development of GPT-5, focusing on improving user experience by unifying AI tools and reducing the need for manual model switching. It includes insights from VP of Research Jerry Tworek on token growth, benchmarks, and the evolving role of humans in the AI era.

Published

2 weeks ago

May 22, 2025

AIinAsia

OpenAI’s next model isn’t chasing headlines—it’s building a smoother, smarter user experience with fewer interruptions the launch of GPT-5 unified tools.

TL;DR — What You Need To Know

GPT-5 aims to unify OpenAI’s tools, reducing the need for switching between models
The Operator screen agent is due for an upgrade, with a push towards becoming a desktop-level assistant
Token usage continues to rise, suggesting growing AI utility and infrastructure demand
Benchmarks are losing their relevance, with real-world use cases taking centre stage
OpenAI believes AI won’t replace humans but may reshape human labour roles

A more cohesive AI experience, not a leap forward

While GPT-4 dazzled with its capabilities, GPT-5 appears to be a quieter force, according to OpenAI’s VP of Research, Jerry Tworek. Speaking during a recent Reddit Q&A with the Codex team, Tworek described the new model as a unifier—not a disruptor.

“We just want to make everything our models can currently do better and with less model switching,” Tworek said. That means streamlining the experience so users aren’t constantly toggling between tools like Codex, Operator, Deep Research and memory functions.

For OpenAI, the future lies in integration over invention. Instead of introducing radically new features, GPT-5 focuses on making the existing stack work together more fluidly. This approach marks a clear departure from the hype-heavy rollouts often associated with new model versions.

Operator: from browser control to desktop companion

One of the most interesting pieces in this puzzle is Operator, OpenAI’s still-experimental screen agent. Currently capable of basic browser navigation, it’s more novelty than necessity. But that may soon change.

An update to Operator is expected “soon,” with Tworek hinting it could evolve into a “very useful tool.” The goal? A kind of AI assistant that handles your screen like a power user, automating online tasks without constantly needing user prompts.

The update is part of a broader push to make AI tools feel like one system, rather than a toolkit you have to learn to assemble. That shift could make screen agents like Operator truly indispensable—especially in Asia, where mobile-first behaviour and app fragmentation often define the user journey.

Integration efforts hit reality checks

Originally, OpenAI promised that GPT-5 would merge the GPT and “o” model series into a single omnipotent system. But as with many grand plans in AI, the reality was less elegant.

In April, CEO Sam Altman admitted the challenge: full integration proved more complex than expected. Instead, the company released o3 and o4-mini as standalone models, tailored for reasoning.

Tworek confirmed that the vision of reduced model switching is still alive—but not at the cost of model performance. Users will still see multiple models under the hood; they just might not have to choose between them manually.

Tokens and the long road ahead

If you think the token boom is a temporary blip, think again. Tworek addressed a user scenario where AI assistants might one day process 100 tokens per second continuously, reading sensors, analysing messages, and more.

That, he says, is entirely plausible. “Even if models stopped improving,” Tworek noted, “they could still deliver a lot of value just by scaling up.”
Author Name

This perspective reflects a strategic bet on infrastructure. OpenAI isn’t just building smarter models; it’s betting on broader usage. Token usage becomes a proxy for economic value—and infrastructure expansion the necessary backbone.

Goodbye benchmarks, hello real work

When asked to compare GPT with rivals like Claude or Gemini, Tworek took a deliberately contrarian stance. Benchmarks, he suggested, are increasingly irrelevant.

“They don’t reflect how people actually use these systems,” he explained, noting that many scores are skewed by targeted fine-tuning.

Instead, OpenAI is doubling down on real-world tasks as the truest test of model performance. The company’s ambition? To eliminate model choice altogether. “Our goal is to resolve this decision paralysis by making the best one.”

The human at the helm

Despite AI’s growing power, Tworek offered a thoughtful reminder: some jobs will always need humans. While roles will evolve, the need for oversight won’t go away.

“In my view, there will always be work only for humans to do,” he said. The “last job,” he suggested, might be supervising the machines themselves—a vision less dystopian, more quietly optimistic.

For Asia’s fast-modernising economies, that might be a signal to double down on education, critical thinking, and human-centred design. The jobs of tomorrow may be less about doing, and more about directing.

Author

AIinAsia

View all posts

Discover more from AIinASIA

Subscribe to get the latest posts sent to your email.

Business

Apple’s China AI pivot puts Washington on edge

Apple’s partnership with Alibaba to deliver AI services in China has sparked concern among U.S. lawmakers and security experts, highlighting growing tensions in global technology markets.

Published

2 weeks ago

May 21, 2025

AIinAsia

As Apple courts Alibaba for its iPhone AI partnership in China, U.S. lawmakers see more than just a tech deal taking shape.

TL;DR — What You Need To Know

Apple has reportedly selected Alibaba’s Qwen AI model to power its iPhone features in China
U.S. lawmakers and security officials are alarmed over data access and strategic implications
The deal has not been officially confirmed by Apple, but Alibaba’s chairman has acknowledged it
China remains a critical market for Apple amid declining iPhone sales
The partnership highlights the growing difficulty of operating across rival tech spheres

Apple Intelligence meets the Great Firewall

Apple’s strategic pivot to partner with Chinese tech giant Alibaba for delivering AI services in China has triggered intense scrutiny in Washington. The collaboration, necessitated by China’s blocking of OpenAI services, raises profound questions about data security, technological sovereignty, and the intensifying tech rivalry between the United States and China. As Apple navigates declining iPhone sales in the crucial Chinese market, this partnership underscores the increasing difficulty for multinational tech companies to operate seamlessly across divergent technological and regulatory environments.

Apple Intelligence Meets Chinese Regulations

When Apple unveiled its ambitious “Apple Intelligence” system in June, it marked the company’s most significant push into AI-enhanced services. For Western markets, Apple seamlessly integrated OpenAI’s ChatGPT as a cornerstone partner for English-language capabilities. However, this implementation strategy hit an immediate roadblock in China, where OpenAI’s services remain effectively banned under the country’s stringent digital regulations.

Faced with this market-specific challenge, Apple initiated discussions with several Chinese AI leaders to identify a compliant local partner capable of delivering comparable functionality to Chinese consumers. The shortlist reportedly included major players in China’s burgeoning AI sector:

Baidu, known for its Ernie Bot AI system
DeepSeek, an emerging player in foundation models
Tencent, the social media and gaming powerhouse
Alibaba, whose open-source Qwen model has gained significant attention

While Apple has maintained its characteristic silence regarding partnership details, recent developments strongly suggest that Alibaba’s Qwen model has emerged as the chosen solution. The arrangement was seemingly confirmed when Alibaba’s chairman made an unplanned reference to the collaboration during a public appearance.

“Apple’s decision to implement a separate AI system for the Chinese market reflects the growing reality of technological bifurcation between East and West. What we’re witnessing is the practical manifestation of competing digital sovereignty models.”
Doctor Emily Zhang, Technology Policy Researcher at Stanford University
Tweet

Washington’s Mounting Concerns

The revelation of Apple’s China-specific AI strategy has elicited swift and pronounced reactions from U.S. policymakers. Members of the House Select Committee on China have raised alarms about the potential implications, with some reports indicating that White House officials have directly engaged with Apple executives on the matter.

Representative Raja Krishnamoorthi of the House Intelligence Committee didn’t mince words, describing the development as “extremely disturbing.” His reaction encapsulates broader concerns about American technological advantages potentially benefiting Chinese competitors through such partnerships.

Greg Allen, Director of the Wadhwani A.I. Centre at CSIS, framed the situation in competitive terms:

“The United States is in an AI race with China, and we just don’t want American companies helping Chinese companies run faster.”

The concerns expressed by Washington officials and security experts include:

Data Sovereignty Issues: Questions about where and how user data from AI interactions would be stored, processed, and potentially accessed
Model Training Advantages: Concerns that the vast user interactions from Apple devices could help improve Alibaba’s foundational AI models
National Security Implications: Worries about whether sensitive information could inadvertently flow through Chinese servers
Regulatory Compliance: Questions about how Apple will navigate China’s content restrictions and censorship requirements

In response to these growing concerns, U.S. agencies are reportedly discussing whether to place Alibaba and other Chinese AI companies on a restricted entity list. Such a designation would formally limit collaboration between American and Chinese AI firms, potentially derailing arrangements like Apple’s reported partnership.

Commercial Necessities vs. Strategic Considerations

Apple’s motivation for pursuing a China-specific AI solution is straightforward from a business perspective. China remains one of the company’s largest and most important markets, despite recent challenges. Earlier this spring, iPhone sales in China declined by 24% year over year, highlighting the company’s vulnerability in this critical market.

Without a viable AI strategy for Chinese users, Apple risks further erosion of its market position at precisely the moment when AI features are becoming central to consumer technology choices. Chinese competitors like Huawei have already launched their own AI-enhanced smartphones, increasing pressure on Apple to respond.

“Apple faces an almost impossible balancing act. They can’t afford to offer Chinese consumers a second-class experience by omitting AI features, but implementing them through a Chinese partner creates significant political exposure in the U.S.
Michael Chen, Technology Analyst at Global Market Insights
Tweet

The situation is further complicated by China’s own regulatory environment, which requires foreign technology companies to comply with data localisation rules and content restrictions. These requirements effectively necessitate some form of local partnership for AI services.

A Blueprint for the Decoupled Future?

Whether Apple’s partnership with Alibaba proceeds as reported or undergoes modifications in response to political pressure, the episode provides a revealing glimpse into the fragmenting global technology landscape.

As digital ecosystems increasingly align with geopolitical boundaries, multinational technology firms face increasingly complex strategic decisions:

Regionalised Technology Stacks: Companies may need to develop and maintain separate technological implementations for different markets
Partnership Dilemmas: Collaborations beneficial in one market may create political liabilities in others
Regulatory Navigation: Operating across divergent regulatory environments requires sophisticated compliance strategies
Resource Allocation: Developing market-specific solutions increases costs and complexity

What we’re seeing with Apple and Alibaba may become the norm rather than the exception. The era of frictionless global technology markets is giving way to one where regional boundaries increasingly define technological ecosystems.
Doctor Sarah Johnson, Geopolitical Risk Consultant
Tweet

Looking Forward

For now, Apple Intelligence has no confirmed launch date for the Chinese market. However, with new iPhone models traditionally released in autumn, Apple faces mounting time pressure to finalise its AI strategy.

The company’s eventual approach could signal broader trends in how global technology firms navigate an increasingly bifurcated digital landscape. Will companies maintain unified global platforms with minimal adaptations, or will we see the emergence of fundamentally different technological experiences across major markets?

As this situation evolves, it highlights a critical reality for the technology sector: in an era of intensifying great power competition, even seemingly routine business decisions can quickly acquire strategic significance.

You May Also Like: