TL;DR:
- Some prominent AI models struggle with EU regulations, particularly in cybersecurity and bias.
- The EU AI Act introduces fines up to €35 million or 7% of global turnover for non-compliance.
- LatticeFlow’s LLM Checker tool helps identify compliance gaps in AI models.
Artificial Intelligence (AI) is growing rapidly in Asia, with tech giants investing heavily in this transformative technology. However, as AI advances, so does the need for regulation. The European Union’s AI Act is set to shake things up, but are Asia’s tech giants ready? Let’s dive into the latest findings from LatticeFlow’s LLM Checker and explore the state of AI compliance in Asia.
The EU AI Act: A Game Changer
The EU AI Act is a comprehensive set of rules aimed at addressing the risks and challenges posed by AI. With the rise of general-purpose AI models like ChatGPT, the EU has accelerated its efforts to enforce these regulations. The AI Act covers various aspects, from cybersecurity to discriminatory output, and non-compliance can result in hefty fines.
LatticeFlow’s LLM Checker: Putting AI Models to the Test
Swiss startup LatticeFlow, in collaboration with researchers from ETH Zurich and INSAIT, has developed the LLM Checker. This tool evaluates AI models based on the EU AI Act’s criteria. The checker scored models from companies like Alibaba, Anthropic, OpenAI, Meta, and Mistral. While many models performed well overall, there were notable shortcomings in specific areas.
Discriminatory Output: A Persistent Challenge
One of the key areas where AI models struggled was discriminatory output. Reflecting human biases around gender, race, and other factors, this issue highlights the need for more inclusive and fair AI development.
- OpenAI’s GPT-3.5 Turbo scored 0.46.
- Alibaba Cloud’s Qwen1.5 72B Chat model scored 0.37.
Cybersecurity: The Battle Against Prompt Hijacking
Prompt hijacking is a type of cyberattack where hackers disguise malicious prompts as legitimate to extract sensitive information. This area also posed challenges for some models.
- Meta’s Llama 2 13B Chat model scored 0.42.
- Mistral’s 8x7B Instruct model scored 0.38.
Top Performer: Anthropic’s Claude 3 Opus
Among the models tested, Anthropic’s Claude 3 Opus stood out with the highest average score of 0.89. This model’s performance indicates that achieving high compliance with the EU AI Act is possible.
The Road to Compliance
Petar Tsankov, CEO and co-founder of LatticeFlow, sees the test results as a positive step. He believes that with a greater focus on optimising for compliance, companies can be well-prepared to meet regulatory requirements.
“The EU is still working out all the compliance benchmarks, but we can already see some gaps in the models. With a greater focus on optimising for compliance, we believe model providers can be well-prepared to meet regulatory requirements.” – Petar Tsankov, CEO, LatticeFlow
The Future of AI Regulation in Asia
As the EU AI Act comes into effect, Asian tech giants must prioritise compliance. Tools like LatticeFlow’s LLM Checker can help identify areas for improvement and guide companies towards developing more responsible AI models.
Comment and Share:
What steps is your organisation taking to ensure AI compliance with regulations like the EU AI Act? Don’t forget to subscribe. Share your insights and let’s discuss the future of AI regulation in Asia!