Google brings AI-driven app control to Android, but usage caps tell the real story
Google is rolling out one of its most practically useful AI features to date: Gemini screen automation, which lets users control Android apps through natural language commands. The feature is currently live on the Samsung Galaxy S26 series in the United States and South Korea, with support for Pixel 10 devices coming soon.
It marks a meaningful step forward in agentic AI on mobile, but the tiered usage limits reveal just how resource-intensive cloud-powered AI actions really are.
By The Numbers
- 5 requests per day for free Gemini account holders using screen automation
- 12 requests per day for Google AI Plus subscribers at $7.99/month
- 20 requests per day for Google AI Pro subscribers at $19.99/month
- 120 requests per day for Google AI Ultra subscribers at $249.99/month
- 200 requests per day for the separate Gemini Agent capability, exclusive to AI Ultra
The feature works by running a supported app inside a virtual window on the device, while cloud infrastructure handles the intelligence layer, telling the phone precisely where to scroll, tap, and type. This architecture means real compute costs sit behind every single command, which explains why Google has introduced firm daily caps across all subscription tiers.
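Google has not documented the protocol behind this loop, but the architecture described above (cloud model plans, device executes) can be sketched roughly as follows. Every name here — `Action`, `run_automation`, the `plan`/`capture`/`execute` callables — is hypothetical, not Google's API:

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Action:
    kind: str        # "tap", "scroll", or "type"
    x: int = 0       # screen coordinates for tap/scroll targets
    y: int = 0
    text: str = ""   # payload for "type" actions

def run_automation(
    goal: str,
    plan: Callable[[bytes, str], Optional[Action]],  # stand-in for the cloud planner
    capture: Callable[[], bytes],                    # screenshot of the virtual window
    execute: Callable[[Action], None],               # device-side tap/scroll/type
    max_steps: int = 20,
) -> bool:
    """Loop until the planner signals completion by returning None.
    Each iteration is one cloud round-trip (screenshot up, action down),
    which is why a single user request carries real compute cost."""
    for _ in range(max_steps):
        action = plan(capture(), goal)
        if action is None:
            return True    # planner reports the goal is met
        execute(action)
    return False           # safety stop: too many steps without completion
```

The point of the sketch is the cost model: one natural-language command fans out into many vision-model round-trips, not one.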
What Gemini Screen Automation Actually Does
Screen automation is not to be confused with the broader Gemini Agent capability, which operates through a live browser instance in the cloud and remains exclusive to AI Ultra subscribers. Screen automation is narrower in scope but arguably more immediately useful for everyday tasks. It is designed to execute structured actions inside specific apps without the user having to navigate menus themselves.
At launch, Gemini screen automation supports six apps: Lyft, Uber, GrubHub, DoorDash, Uber Eats, and Starbucks. The choice of apps is telling. These are high-frequency, transactional platforms where users repeat the same sequences constantly: booking a ride, reordering a coffee, scheduling a grocery delivery.
Supported commands include:
- "Book a ride to the airport"
- "Schedule a ride for tomorrow"
- "Reorder my last coffee"
- "Order pizza for delivery"
- "Add milk and eggs to my grocery cart"
- "Order groceries to my mum's house"
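How a free-form command gets mapped to an app and an action is not public; Google presumably uses the model itself for intent routing rather than keyword rules. Still, a toy router makes the idea concrete (all app and intent names below are illustrative, not a real API):

```python
# Toy intent routing: match keywords in a command to an (app, intent) pair.
# Google's real matching is model-based; this is only a sketch of the idea.
ROUTES = [
    ({"book", "ride"},      ("Uber", "book_ride")),
    ({"reorder", "coffee"}, ("Starbucks", "reorder_last")),
    ({"pizza", "delivery"}, ("DoorDash", "order_delivery")),
]

def route(command: str):
    words = set(command.lower().replace(",", "").split())
    for keywords, target in ROUTES:
        if keywords <= words:   # every keyword appears in the command
            return target
    return None                 # unsupported command

print(route("Book a ride to the airport"))   # → ('Uber', 'book_ride')
```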
"This represents a fundamental shift from AI that answers questions to AI that takes action on your behalf. The compute requirements are exponentially higher when you're actually manipulating interfaces rather than generating text." - David Chen, Mobile AI Research Lead, Stanford University
The Subscription Tier Problem
The usage caps expose a structural tension in how Google is monetising Gemini features. A free user who attempts to automate five tasks in a single morning will be locked out for the remainder of the day. For a feature positioned as reducing friction in daily life, that constraint is significant.
At AI Plus ($7.99/month), 12 requests per day is still modest for anyone leaning on automation for both ride-booking and food ordering in a single afternoon. The jump from AI Pro at $19.99 to AI Ultra at $249.99 is particularly steep, yet Ultra's 120-request daily allowance is the first tier where screen automation becomes usable as a genuine daily driver.
For most consumers, that price point is prohibitive. The more realistic use case for AI Pro subscribers is occasional, task-specific automation rather than habitual use. As our analysis of AI automation in business workflows demonstrates, managing AI tool limits is becoming its own category of cognitive overhead.
"The pricing structure tells you everything about the true cost of agentic AI. At $249.99 monthly for meaningful usage, this is clearly targeting enterprise early adopters, not mainstream consumers." - Sarah Kim, AI Product Strategy, Samsung Electronics
| Subscription Tier | Monthly Cost | Screen Automation Requests/Day | Cost per Request* |
|---|---|---|---|
| Free | $0 | 5 | $0 |
| Google AI Plus | $7.99 | 12 | $0.022 |
| Google AI Pro | $19.99 | 20 | $0.033 |
| Google AI Ultra | $249.99 | 120 | $0.069 |

*Assuming the daily cap is fully used every day of a 30-day month.
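A per-request figure only means anything under an assumption about usage, so it is worth making the arithmetic explicit: monthly fee divided by a fully used daily cap over a 30-day month.

```python
# Effective cost per screen-automation request, assuming the daily cap
# is fully used every day of a 30-day month.
tiers = {
    "Google AI Plus":  (7.99, 12),    # (monthly cost in USD, requests/day)
    "Google AI Pro":   (19.99, 20),
    "Google AI Ultra": (249.99, 120),
}

per_request = {
    name: round(monthly / (cap * 30), 3)
    for name, (monthly, cap) in tiers.items()
}
print(per_request)
# {'Google AI Plus': 0.022, 'Google AI Pro': 0.033, 'Google AI Ultra': 0.069}
```

Light users pay far more per action: a Pro subscriber who automates a single task per day is effectively paying about $0.67 per request.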
Device Availability and the Samsung Partnership
The initial rollout is limited to the Samsung Galaxy S26 series, available in two markets: the United States and South Korea. Pixel 10, Pixel 10 Pro, and Pixel 10 Pro XL support has been confirmed but is not yet live, with a US-only launch planned.
The Samsung-first approach reflects Google's ongoing strategic partnership with the Korean electronics giant, which has seen Gemini deeply integrated into Samsung's One UI experience ahead of broader Android availability. The Korea inclusion is noteworthy. It positions South Korea as a launch market alongside the US, rather than an afterthought in a phased Asia rollout.
However, the limited app catalogue at launch is heavily US-centric. Lyft, GrubHub, and DoorDash have little or no presence in South Korea, which means Korean Galaxy S26 users have far fewer practical automation options at this stage.
What This Means for Asia-Pacific
South Korea's inclusion as a day-one market for Gemini screen automation signals that Google views Asia-Pacific as central to its agentic AI ambitions, not a secondary market for delayed feature rollouts. Samsung's dominance in the Korean and broader Asian Android market means that any feature shipping on Galaxy S26 hardware reaches a substantial installed base quickly.
However, the current app catalogue represents a significant gap for most of Asia-Pacific. Grab, Gojek, foodpanda, and Lazada, the dominant super-app and delivery platforms across Southeast Asia, are absent from the initial supported list. Until Google expands screen automation to include regionally relevant apps, the feature's utility in markets like Singapore, Malaysia, and Thailand remains limited.
This mirrors broader challenges in AI deployment across Asian markets, where localisation extends beyond language to platform ecosystems. The success of screen automation in Asia will ultimately depend on Google's ability to integrate with regional super-apps rather than Western-focused services.
How does Gemini screen automation differ from voice assistants?
Voice assistants typically call APIs or use pre-built integrations. Screen automation actually sees and manipulates app interfaces in real time, which requires significantly more computational resources but offers broader compatibility with existing apps.
Why are the usage limits so restrictive?
Each screen automation request requires cloud-based computer vision, interface analysis, and action planning. Unlike text generation, these visual AI tasks consume substantial compute resources, making unlimited usage economically infeasible at current pricing.
Will other Android manufacturers get screen automation?
Google hasn't announced broader Android rollout plans beyond Samsung and Pixel devices. The feature likely requires specific hardware optimisations and partnerships, suggesting a gradual expansion rather than immediate universal availability.
What happens if an automation request fails?
Failed requests still count against daily limits. Google recommends using clear, specific commands and ensuring apps are updated to their latest versions for best success rates.
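Because failed attempts burn quota, a client wrapping these requests might track its own daily counter and warn before sending a doomed command. A minimal local-tracking sketch (the cap of 5 matches the published free tier; the class itself is hypothetical, not a Google API):

```python
from datetime import date

class DailyQuota:
    """Local counter so a client can warn before wasting a request;
    failed requests count against the cap just like successful ones."""
    def __init__(self, daily_cap: int):
        self.daily_cap = daily_cap
        self._day = date.today()
        self._used = 0

    def try_consume(self) -> bool:
        today = date.today()
        if today != self._day:          # midnight rollover resets the count
            self._day, self._used = today, 0
        if self._used >= self.daily_cap:
            return False                # out of requests until tomorrow
        self._used += 1
        return True

quota = DailyQuota(daily_cap=5)         # free-tier cap
available = [quota.try_consume() for _ in range(6)]
print(available)                        # five True values, then False
```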
Can businesses use screen automation for customer service?
Current terms of service restrict screen automation to personal use. Enterprise applications would require separate licensing agreements and likely custom implementation through Google's business AI products.
Screen automation marks a clear evolution from conversational AI to actionable AI, but the subscription tiers suggest we're still in the early adopter phase. As demonstrated by other Gemini capabilities, Google's approach is to launch premium features that eventually trickle down to broader audiences. The real test will be whether the supported app ecosystem expands faster than competitors can match the functionality.
How do you see screen automation changing daily smartphone usage, and which regional apps would you want to see supported first? Drop your take in the comments below.
Latest Comments (4)
yeah makes sense on the virtual window thing for compute. we see super similar load for client-side rendered elements on our own stuff actually.
FIVE requests a day for FREE users? For our users in rural areas, that's like a week of data. At my company we're building offline-first, this cloud dependency seems like a huge barrier for adoption outside of big cities
it's interesting how these usage caps play directly into the compute costs of cloud-powered ai actions. at nus, we actually ran an internal test with a similar agentic ai prototype last year, though focused on more specialized scientific data parsing tasks. even with a much smaller user base of just our lab team, the resource consumption for each api call was significant. we quickly realized that without some tiering or optimization, scalability for wider usage would hit a wall very fast. makes total sense why google would implement these strict limits, even for their entry-level free usage. it really highlights the tangible infrastructure behind these ai capabilities.
Huh, only 5 free requests a day for screen automation. that's pretty wild considering the compute we use for some of our internal multimodal projects, even just for screen understanding not full action. i remember a few years ago we were struggling with inference times on our mobile agent research when trying to integrate complex vision models... wonder how their virtual window on the device approach affects latency for different app types. it's not really comparable to pure cloud agents like gemini agent since that's browser based, but 5 requests still seems very low for consumer adoption. must be heavy on the cloud side then...