How Multi-Modal LLMs Reshape Visual Deep Learning Testing? A Comprehensive Study Through the Lens of Image Mutation
Liwen Wang, Yuanyuan Yuan, Ao Sun, Zongjie Li, Pingchuan Ma, Daoyuan, Wu, Shuai Wang

TL;DR
This study explores how multi-modal large language models can generate diverse, high-quality image mutations for visual deep learning testing, revealing their strengths and limitations compared to traditional methods.
Contribution
It provides the first comprehensive evaluation of MLLM-based image mutations for VDL testing, highlighting their potential to expand mutation semantics and complement traditional approaches.
Findings
MLLMs generate high-quality, semantically rich image mutations.
Traditional mutations like rotation are less supported by MLLMs.
Combining MLLM-based and traditional mutations enhances testing coverage.
Abstract
Visual deep learning (VDL) systems have shown significant success in real-world applications like image recognition, object detection, and autonomous driving. To evaluate the reliability of VDL, a mainstream approach is software testing, which requires diverse mutations over image semantics. The rapid development of multi-modal large language models (MLLMs) has introduced revolutionary image mutation potentials through instruction-driven methods. Users can now freely describe desired mutations and let MLLMs generate the mutated images. Hence, parallel to large language models' (LLMs) recent success in traditional software fuzzing, one may also expect MLLMs to be promising for VDL testing in terms of offering unified, diverse, and complex image mutations. However, the quality and applicability of MLLM-based mutations in VDL testing remain largely unexplored. We present the first study,…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsAdvanced Neural Network Applications · Domain Adaptation and Few-Shot Learning · Machine Learning and Data Classification
