Bidirectional Empowerment of Metamorphic Testing and Large Language Models: A Systematic Survey
Zheng Zheng, Zenghui Zhou, Yinwang Xu, Daixu Ren, Tsong Yueh Chen

TL;DR
This systematic survey reviews 93 primary studies on how metamorphic testing helps test large language models and how LLMs, in turn, help automate metamorphic testing.
Contribution
It introduces a taxonomy of how metamorphic testing and LLMs mutually empower each other, covering verification, relation discovery, and automated testing techniques.
Findings
Reviewed 93 primary studies on MT and LLMs
Proposed a taxonomy of bidirectional empowerment
Identified future directions for AI quality assurance
Abstract
Large language models (LLMs) have introduced substantial challenges to software quality assurance due to their generative, probabilistic, and open-ended nature, which intensifies the oracle problem and limits the applicability of traditional testing methods. Metamorphic testing (MT), which checks necessary relations among multiple related executions rather than relying on exact expected outputs, has emerged as a promising approach for testing LLMs and other oracle-deficient systems. At the same time, the strong semantic understanding, reasoning, and code generation capabilities of LLMs create new opportunities to automate the traditionally labor-intensive phases of MT. This survey systematically reviews 93 primary studies and characterizes this reciprocal relationship as the bidirectional empowerment of MT and LLMs. We propose a taxonomy spanning two complementary directions: MT for…
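To make the core idea of the abstract concrete, here is a minimal sketch of a metamorphic test in Python. It is not taken from the survey. The model under test (`toy_sentiment`), the transformation (`add_neutral_suffix`), and the checker (`check_invariance`) are all hypothetical names chosen for illustration, and the keyword-based classifier stands in for a real LLM call. The relation assumed here is that appending a sentiment-neutral clause should not change the predicted label, so the test compares two related executions and needs no expected output for either one.

```python
from typing import Callable


def toy_sentiment(text: str) -> str:
    """Hypothetical stand-in for the system under test (not a real LLM)."""
    positive = {"good", "great", "excellent"}
    negative = {"bad", "poor", "terrible"}
    words = text.lower().split()
    score = sum(w in positive for w in words) - sum(w in negative for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def add_neutral_suffix(text: str) -> str:
    """Build the follow-up input by appending a sentiment-neutral clause."""
    return text + " I read it on Tuesday"


def check_invariance(
    model: Callable[[str], str],
    transform: Callable[[str], str],
    source_input: str,
) -> bool:
    """Return True if the source and follow-up executions give the same label."""
    source_output = model(source_input)
    follow_up_output = model(transform(source_input))
    return source_output == follow_up_output


inputs = ["The survey is great", "The tooling is poor", "It exists"]
results = [check_invariance(toy_sentiment, add_neutral_suffix, s) for s in inputs]
print(results)
```

A violated relation (a `False` entry) would signal a likely fault without anyone having to state what the correct label is, which is why this style of check suits oracle-deficient systems.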
