
MarioVGG Simulates Super Mario Bros. from Video Footage

MarioVGG, a new AI model, simulates Super Mario Bros. from video footage, showcasing the potential for AI to revolutionise gaming.

Intelligence Desk · 3 min read

- MarioVGG is a new AI model that can generate plausible video of Super Mario Bros. from user inputs.
- The model was trained on over 737,000 frames of Mario gameplay.
- Despite limitations, MarioVGG shows potential for AI to replace game engines in the future.

The Future of Gaming: AI-Generated Video

Imagine playing your favourite video game without a traditional game engine. Instead, an AI model generates the gameplay based on video footage. This is the fascinating concept behind MarioVGG, a new AI model that simulates Super Mario Bros. from video data. Developed by researchers from Virtuals Protocol, MarioVGG represents a significant step towards AI-generated video games.

Training MarioVGG: A Massive Undertaking

To train MarioVGG, the researchers used a public dataset containing 280 levels of Super Mario Bros. gameplay. This dataset included over 737,000 individual frames, which were preprocessed into 35-frame chunks. The model focused on two inputs: "run right" and "run right and jump." Even with these limitations, training the model took about 48 hours on a single RTX 4090 graphics card.
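The article gives only the totals (737,000 frames sliced into 35-frame chunks), so as a rough illustration, here is what that preprocessing step might look like. `chunk_frames` is a hypothetical helper, not code from the MarioVGG project:

```python
def chunk_frames(frames, chunk_size=35):
    """Slice a long sequence of gameplay frames into fixed-length
    training chunks, dropping any trailing partial chunk."""
    return [frames[i:i + chunk_size]
            for i in range(0, len(frames) - chunk_size + 1, chunk_size)]

# Stand-in for 737,000 real image frames: just integer placeholders.
frames = list(range(737_000))
chunks = chunk_frames(frames)
print(len(chunks), len(chunks[0]))  # 21057 full chunks of 35 frames each
```

At 35 frames per chunk, the 737,000-frame dataset yields roughly 21,000 training chunks.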

How MarioVGG Works

MarioVGG uses a standard convolution and denoising process to generate new frames of video from a static starting game image and a text input. The model can create gameplay videos of any length by using the last frame of one sequence as the first frame of the next, which results in "coherent and consistent gameplay," according to the researchers. For a deeper dive into video generation, you might explore our Beginner's Guide to Using Sora AI Video.

Challenges and Limitations

Despite its impressive capabilities, MarioVGG has several limitations. The model downscales its output to a resolution of 64×48, far below the NES's native 256×240. It also condenses every 35 frames of source video into just seven generated frames, resulting in rougher-looking gameplay. Additionally, MarioVGG is nowhere near real-time generation, taking six seconds to produce a six-frame video sequence. For more on the hurdles facing generative AI, see Running Out of Data: The Strange Problem Behind AI's Next Bottleneck.
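To put that latency figure in perspective, a quick back-of-envelope calculation; the six-frames-in-six-seconds figure is from the article, while the NES's roughly 60 fps output rate is a well-known figure assumed here:

```python
# Back-of-envelope: how far is MarioVGG from real-time gameplay?
gen_frames, gen_seconds = 6, 6.0    # figures reported for MarioVGG
gen_fps = gen_frames / gen_seconds  # 1.0 generated frame per second
nes_fps = 60                        # NTSC NES outputs roughly 60 fps
slowdown = nes_fps / gen_fps
print(f"{gen_fps:.0f} fps generated vs {nes_fps} fps target: "
      f"~{slowdown:.0f}x short of real time")
```

In other words, on the reported hardware the model would need a roughly sixty-fold speedup before it could drive an interactive game.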

Impressive Results Despite Limitations

Even with these limitations, MarioVGG can create passably believable video of Mario running and jumping. The model can infer game physics, such as Mario falling when he runs off a cliff and halting his forward motion when adjacent to an obstacle. MarioVGG can also hallucinate new obstacles for Mario, although these can't be influenced by user prompts. The ability of AI to generate creative content continues to evolve, as seen in OpenAI Adds Reusable ‘Characters’ and Video Stitching to Sora.

The Future of AI in Gaming

The researchers hope that MarioVGG represents a first step towards "producing and demonstrating a reliable and controllable video game generator." They even suggest that AI models like MarioVGG could one day replace game development and game engines completely. This echoes broader discussions about AI's impact on other industries, such as our piece AI & Call Centres: Is The End Nigh? The potential for AI to transform creative fields is immense, as detailed in this research paper on AI in game design.

Comment and Share:

What do you think about the future of AI in gaming? Could AI models like MarioVGG really replace traditional game engines? Share your thoughts and experiences in the comments below. Don't forget to Subscribe to our newsletter for updates on AI and AGI developments.



Latest Comments (4)

Soo-yeon Park (@sooyeon)
11 February 2026

wow 737,000 frames is a lot for just two inputs, "run right" and "run right and jump." makes me think about how much data would be needed to simulate something more complex, like a k-drama scene with different character interactions and dialogue. how much bigger would the dataset need to be for a more nuanced world?

Elaine Ng (@elaineng)
3 December 2024

I'm thinking about the implications for game preservation, especially with so many older titles in Asia that might not have easily accessible source code, if all you need is footage.

Pierre Dubois (@pierred)
29 October 2024

It's interesting to see MarioVGG and the whole idea of replacing game engines with AI-generated video. From a research perspective, while 737,000 frames is a substantial dataset for a single game, the downscaled 64x48 resolution and the 35 frames compressed to seven generated frames highlight a persistent challenge in generative models for video. This reminds me of some of the work we see coming out of institutions like EPFL, focusing on ultra-low latency inference for high-fidelity video generation. The "coherent and consistent gameplay" is a good step, but the path to truly actionable, high-resolution, real-time interactive environments via pure generation, sans engine, remains a significant hurdle. C'est la vie, these models are still in their infancy.

Somchai Wongsa (@somchaiw)
22 October 2024

The mention of using 737,000 frames for training, even for a classic like Mario, highlights the significant data requirements. This is something we are actively considering in our discussions around building shared digital infrastructure within the ASEAN digital economy framework. I'm keen to see if this kind of model could be adapted for public service simulations.
