MuMuTestUp: Mutation-based Multi-Agent Test Case Update
Dawei Tian (1), Jiakun Liu (1), Yun Peng (2), Yichen Zhang (1), Jianlei Chi (3), Jun Sun (4), Xiaohong Su (1) ((1) Harbin Institute of Technology, (2) The Chinese University of Hong Kong, (3) Xidian University, (4) Singapore Management University)

TL;DR
MuMuTestUp is a mutation-guided multi-agent framework that improves automatic test case updates by addressing assertion adequacy, precise coverage, and hallucination issues using specialized agents.
Contribution
It introduces a novel mutation-based multi-agent approach with three specialized agents and a new dataset for effective test case updating in evolving software.
Findings
MuMuTestUp outperforms state-of-the-art baselines on test update tasks
It improves both assertion adequacy and the precision of coverage targeting
It handles LLM-hallucinated retrieval queries through semantic retrieval, where exact matching fails
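To illustrate why exact-matching retrieval breaks on a hallucinated query, here is a minimal sketch. It is not the paper's retrieval method: it substitutes string similarity from Python's standard `difflib` for semantic retrieval, and the symbol table, function names, and query are all hypothetical.

```python
import difflib

# Hypothetical index of code symbols available as context.
SYMBOLS = {
    "calculate_total_price": "def calculate_total_price(items): ...",
    "parse_order": "def parse_order(raw): ...",
}

def exact_lookup(query, symbols):
    """Exact-matching retrieval: returns None unless the name matches exactly."""
    return symbols.get(query)

def fuzzy_lookup(query, symbols, cutoff=0.6):
    """Stand-in for semantic retrieval: return the closest symbol by string similarity."""
    matches = difflib.get_close_matches(query, list(symbols), n=1, cutoff=cutoff)
    return symbols[matches[0]] if matches else None

# A query naming a symbol that does not exist (assumed hallucination).
hallucinated_query = "calc_total_price"
exact_result = exact_lookup(hallucinated_query, SYMBOLS)
fuzzy_result = fuzzy_lookup(hallucinated_query, SYMBOLS)
```

In this sketch `exact_result` is `None` because no symbol carries the hallucinated name, while `fuzzy_result` recovers the definition of `calculate_total_price`. A real semantic retriever would presumably rank candidates by meaning rather than by character overlap.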
Abstract
Modern software systems evolve rapidly under CI/CD practices, where tests are critical for quality. However, substantial code changes often render existing test cases obsolete, causing pipeline disruptions, reduced productivity, and compromised quality. Recent automatic test update approaches leverage LLMs to refine test cases via execution feedback and exact-matching context retrieval. They prioritize executability and line coverage but suffer from three limitations: (1) they neglect test assertion adequacy, which weakens fault detection; (2) they rely on coarse line coverage instead of targeting specific uncovered lines and branches; (3) they use exact-matching retrieval, which fails for LLM-hallucinated queries. To address these, we propose MuMuTestUp, a mutation-guided multi-agent framework with three specialized agents: Mutation Analysis (strengthens assertions via surviving mutants), Coverage Analysis…
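The link between surviving mutants and assertion adequacy can be shown with a small sketch. Everything here is hypothetical and for illustration only (the function under test, the hand-written mutant, and both tests); it does not reproduce the Mutation Analysis agent.

```python
def apply_discount(price, rate):
    """Hypothetical function under test."""
    return price * (1 - rate)

def apply_discount_mutant(price, rate):
    """Hand-written mutant: the '-' operator is replaced by '+'."""
    return price * (1 + rate)

def weak_test(fn):
    """Executes the code (full line coverage) but only checks that the result is positive."""
    return fn(100.0, 0.2) > 0

def strong_test(fn):
    """Asserts the exact expected value, so a changed operator is detected."""
    return abs(fn(100.0, 0.2) - 80.0) < 1e-9

def survives(test, mutant):
    """A mutant survives when the test still passes while running against it."""
    return test(mutant)

weak_survives = survives(weak_test, apply_discount_mutant)
strong_survives = survives(strong_test, apply_discount_mutant)
```

Both tests pass on the original function and cover the same line, yet the mutant survives only the weak test. A surviving mutant therefore points at an assertion that could be strengthened, which line coverage alone would not reveal.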
