Streamlining Acceptance Test Generation for Mobile Applications Through Large Language Models: An Industrial Case Study
Pedro Luís Fonseca, Bruno Lima, João Pascoal Faria

TL;DR
This paper presents AToMIC, an automated framework using Large Language Models to generate acceptance testing artifacts for mobile apps, significantly reducing manual effort and increasing efficiency in industrial settings.
Contribution
We introduce AToMIC, a novel LLM-based framework that automates acceptance test generation from requirements and code changes for mobile applications.
Findings
93.3% of Gherkin scenarios were syntactically correct
78.8% of PageObjects ran without manual edits
100% of generated UI tests executed successfully
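For illustration only, here is a minimal sketch of how a "syntactically correct" rate for generated Gherkin scenarios could be computed. The summary does not say how the authors validated syntax, so the structural check below is an assumption: it is a deliberately simplified stand-in for a real Gherkin parser. The function names (`is_plausible_gherkin`, `syntactic_pass_rate`) and the sample scenario text are hypothetical.

```python
# Simplified structural check for Gherkin text (assumption: NOT the paper's validator).
# It only recognizes "Feature:", "Scenario:", and Given/When/Then/And/But steps.
STEP_KEYWORDS = ("Given ", "When ", "Then ", "And ", "But ")


def is_plausible_gherkin(text: str) -> bool:
    """Return True if the text has a Feature header and at least one
    Scenario, and every Scenario contains only recognized step lines."""
    lines = [ln.strip() for ln in text.splitlines()]
    lines = [ln for ln in lines if ln and not ln.startswith("#")]
    if not lines or not lines[0].startswith("Feature:"):
        return False

    in_scenario = False
    seen_scenario = False
    steps_in_scenario = 0
    for ln in lines[1:]:
        if ln.startswith("Scenario:"):
            if in_scenario and steps_in_scenario == 0:
                return False  # previous scenario had no steps
            in_scenario = True
            seen_scenario = True
            steps_in_scenario = 0
        elif ln.startswith(STEP_KEYWORDS):
            if not in_scenario:
                return False  # step outside any scenario
            steps_in_scenario += 1
        elif in_scenario:
            return False  # unrecognized line inside a scenario
        # Free-text description lines before the first scenario are allowed.
    return seen_scenario and steps_in_scenario > 0


def syntactic_pass_rate(scenarios: list[str]) -> float:
    """Percentage of scenario texts that pass the structural check."""
    if not scenarios:
        return 0.0
    passed = sum(is_plausible_gherkin(s) for s in scenarios)
    return 100.0 * passed / len(scenarios)


# Hypothetical sample inputs, not taken from the paper.
valid_example = (
    "Feature: Login\n"
    "  Scenario: Successful login\n"
    "    Given the app is open\n"
    "    When the user enters valid credentials\n"
    "    Then the home screen is shown\n"
)
invalid_example = (
    "Feature: Login\n"
    "  Scenario: Broken\n"
    "    the user taps login\n"
)
```

In practice a full Gherkin parser would be the more reliable choice, since this sketch ignores constructs such as `Background`, `Scenario Outline`, tags, and data tables.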
Abstract
Mobile acceptance testing remains a bottleneck in modern software development, particularly for cross-platform mobile development using frameworks like Flutter. While developers increasingly rely on automated testing tools, creating and maintaining acceptance test artifacts still demands significant manual effort. To help tackle this issue, we introduce AToMIC, an automated framework leveraging specialized Large Language Models to generate Gherkin scenarios, Page Objects, and executable UI test scripts directly from requirements (JIRA tickets) and recent code changes. Applied to BMW's MyBMW app, covering 13 real-world issues in a 170+ screen codebase, AToMIC produced executable test artifacts in under five minutes per feature on standard hardware. The generated artifacts were of high quality: 93.3% of Gherkin scenarios were syntactically correct upon generation, 78.8% of PageObjects ran…
