A Study of Using Multimodal LLMs for Non-Crash Functional Bug Detection in Android Apps
Bangyan Ju, Jin Yang, Tingting Yu, Tamerlan Abdullayev, Yuanyuan Wu, Dingbang Wang, Yu Zhao

TL;DR
This paper empirically explores using large language models as test oracles to detect non-crash functional bugs in Android apps, demonstrating significant detection capabilities and uncovering previously unknown bugs.
Contribution
It introduces a novel application of LLMs as test oracles for NCF bug detection in Android apps, outperforming existing tools and revealing new bugs.
Findings
LLMs achieved a 49% bug detection rate.
Detected 24 previously unknown NCF bugs in 64 apps.
Identified limitations like false positives and performance issues.
Abstract
Numerous approaches employing various strategies have been developed to test the graphical user interfaces (GUIs) of mobile apps. However, traditional GUI testing techniques, such as random and model-based testing, primarily focus on generating test sequences that achieve high code coverage, and they often fail to act as effective test oracles for non-crash functional (NCF) bug detection. To address these limitations, this study empirically investigates the capability of large language models (LLMs) to serve as test oracles for detecting NCF bugs in Android apps. Our intuition is that the training corpora of LLMs, which encompass extensive mobile app usage and bug report descriptions, equip them with the domain knowledge relevant to NCF bug detection. We conducted a comprehensive empirical study to explore the effectiveness of LLMs as test oracles for detecting NCF bugs in Android…
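To make the idea of an LLM acting as a test oracle concrete, here is a minimal sketch in Python. It is not the authors' pipeline: the names `GuiStep`, `build_prompt`, `parse_verdict`, and `run_oracle`, the prompt wording, and the `OK` / `BUG: <reason>` reply format are all assumptions made for illustration. The model is passed in as a plain callable (`ask_model`) so that no particular LLM API is implied, and the sketch uses only screen text, whereas the paper studies multimodal models.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class GuiStep:
    action: str        # hypothetical description, e.g. "tap 'Save'"
    screen_text: str   # text visible on screen after the action


@dataclass
class Verdict:
    is_bug: bool
    reason: str


def build_prompt(app_name: str, steps: List[GuiStep]) -> str:
    """Turn a test sequence into a question for the model (assumed wording)."""
    lines = [
        f"You are testing the Android app '{app_name}'.",
        "After each action, the visible screen text is listed.",
        "Answer 'BUG: <reason>' if the behavior looks functionally wrong,",
        "otherwise answer 'OK'.",
        "",
    ]
    for i, step in enumerate(steps, start=1):
        lines.append(f"{i}. action: {step.action}")
        lines.append(f"   screen: {step.screen_text}")
    return "\n".join(lines)


def parse_verdict(reply: str) -> Verdict:
    """Interpret the reply using the assumed 'OK' / 'BUG: <reason>' format."""
    text = reply.strip()
    if text.upper().startswith("BUG"):
        _, _, reason = text.partition(":")
        return Verdict(True, reason.strip())
    return Verdict(False, "")


def run_oracle(
    app_name: str,
    steps: List[GuiStep],
    ask_model: Callable[[str], str],  # any function mapping prompt -> reply
) -> Verdict:
    """Ask the model whether the observed behavior is a non-crash bug."""
    return parse_verdict(ask_model(build_prompt(app_name, steps)))
```

With a stub in place of a real model, `run_oracle("Notes", [GuiStep("tap 'Save'", "Notes (0)")], lambda p: "BUG: saved note is missing from the list")` returns a `Verdict` flagged as a bug. Keeping the model behind a callable also makes it easy to measure false positives by replaying the same sequences against different models.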
Taxonomy
Topics: Advanced Malware Detection Techniques · Mobile and Web Applications · Software Testing and Debugging Techniques
