Scaling Mobile Chaos Testing with AI-Driven Test Execution
Juan Marcano, Ashish Samant, Kai Song, Lingchao Chen, Kaelan Mikowicz, Tim Smyth, Mengdie Zhang, Ali Zamani, Arturo Bravo Rovirosa, Sowjanya Puligadda, Srikanth Prodduturi, and Mayank Bansal

TL;DR
This paper introduces an AI-driven mobile chaos testing system that automates and scales resilience validation for large mobile applications, identifying critical failure points with minimal manual effort.
Contribution
It integrates DragonCrawl, an LLM-based mobile testing platform, with uHavoc, a service-level fault injection system, enabling automated resilience testing for mobile apps at production scale.
Findings
Executed over 180,000 automated chaos tests across 47 critical flows in Uber's Rider, Driver, and Eats applications
Identified 23 resilience risks, including critical failures
Achieved 88% precision in root cause attribution
Abstract
Mobile applications in large-scale distributed systems are susceptible to backend service failures, yet traditional chaos engineering approaches cannot scale mobile testing due to the combinatorial explosion of flows, locations, and failure scenarios that need validation. We present an automated mobile chaos testing system that integrates DragonCrawl, an LLM-based mobile testing platform, with uHavoc, a service-level fault injection system. The key insight is that adaptive AI-driven test execution can navigate mobile applications under degraded backend conditions, eliminating the need to manually write test cases for each combination of user flow, city, and failure type. Since Q1 2024, our system has executed over 180,000 automated chaos tests across 47 critical flows in Uber's Rider, Driver, and Eats applications, representing approximately 39,000 hours of manual testing effort that…
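The combinatorial growth the abstract describes can be illustrated with a minimal sketch. The flow, city, and failure-type names below are hypothetical placeholders, not values from the paper, and `chaos_test_matrix` is an illustrative helper, not part of DragonCrawl or uHavoc. The final calculation uses only the two approximate figures quoted in the abstract (180,000 tests, 39,000 hours) to estimate the manual effort implied per test.

```python
from itertools import product


def chaos_test_matrix(flows, cities, failure_types):
    """Enumerate every (flow, city, failure type) combination.

    Hypothetical helper: shows why hand-written test cases scale poorly,
    since the matrix size is the product of the three dimensions.
    """
    return list(product(flows, cities, failure_types))


# Hypothetical example values, not taken from the paper.
flows = ["request_ride", "place_order", "go_online"]
cities = ["san_francisco", "sao_paulo"]
failure_types = ["timeout", "http_500", "high_latency"]

matrix = chaos_test_matrix(flows, cities, failure_types)
print(len(matrix))  # 3 * 2 * 3 = 18 combinations

# Rough manual effort per test implied by the abstract's approximate figures.
minutes_per_test = 39_000 * 60 / 180_000
print(minutes_per_test)  # 13.0
```

Even this toy matrix needs 18 separate cases. Adding a single city or failure type multiplies the total, which is the scaling problem that adaptive test execution is meant to avoid.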
Taxonomy
Topics: Software System Performance and Reliability · Software Testing and Debugging Techniques · Distributed systems and fault tolerance
