Model-Enhanced LLM-Driven VUI Testing of VPA Apps
Suwan Li, Lei Bu, Guangdong Bai, Fuman Xie, Kai Chen, Chang Yue

TL;DR
Elevate is a novel framework that uses large language models to enhance the testing of voice assistant apps by constructing behavior models and generating effective test inputs, leading to higher coverage and efficiency.
Contribution
This work introduces Elevate, a model-enhanced LLM-driven VUI testing framework that improves state space coverage and testing efficiency for VPA apps.
Findings
Achieves 15% higher state coverage than Vitas on Alexa skills.
Effectively constructs behavior models from app outputs.
Significantly improves testing efficiency.
Abstract
The flourishing ecosystem centered around voice personal assistants (VPAs), such as Amazon Alexa, has led to a boom in VPA apps. The largest app market, the Amazon skills store, for example, hosts over 200,000 apps. Despite their popularity, the open nature of app release and the easy accessibility of apps also raise significant concerns regarding security, privacy, and quality. Consequently, various testing approaches have been proposed to systematically examine VPA app behaviors. To tackle the inherent lack of a visible user interface in VPA apps, two strategies are employed during testing: chatbot-style testing and model-based testing. The former often lacks effective guidance for expanding its search space, while the latter falls short in interpreting the semantics of conversations to construct precise and comprehensive behavior models for apps. In this work, we introduce…
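To make the idea of a behavior model and state coverage more concrete, here is a minimal sketch in Python. It is not Elevate's implementation: the class name `BehaviorModel`, the helper `state_coverage`, and the example states and utterances are all hypothetical, and it assumes a state can be identified by a label for the app's output and a transition by the user utterance that triggered it.

```python
from collections import defaultdict


class BehaviorModel:
    """Toy behavior model (hypothetical): states are app outputs, edges are user utterances."""

    def __init__(self):
        # Maps state -> {utterance -> next state}.
        self.transitions = defaultdict(dict)
        self.states = set()

    def record(self, state, utterance, next_state):
        """Store one observed conversation step."""
        self.states.update((state, next_state))
        self.transitions[state][utterance] = next_state

    def untried(self, state, candidates):
        """Return candidate utterances not yet tried in this state."""
        tried = self.transitions.get(state, {})
        return [u for u in candidates if u not in tried]


def state_coverage(visited, all_states):
    """Fraction of known states that testing has reached."""
    if not all_states:
        return 0.0
    return len(visited & all_states) / len(all_states)


# Example states and utterances are made up for illustration.
model = BehaviorModel()
model.record("welcome", "start quiz", "question_1")
model.record("question_1", "answer a", "question_2")

print(model.untried("welcome", ["start quiz", "help"]))  # ['help']
print(state_coverage(model.states,
                     {"welcome", "question_1", "question_2", "help_menu"}))  # 0.75
```

In this sketch the model is what guides exploration: a tester would favor utterances returned by `untried`, which is the kind of guidance the abstract says chatbot-style testing lacks. How states are derived from the semantics of app outputs is the hard part and is not modeled here.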
Taxonomy
Topics: Model-Driven Software Engineering Techniques · Mobile and Web Applications · Software Testing and Debugging Techniques
