Large Language Models for Equivalent Mutant Detection: How Far Are We?

Zhao Tian; Honglin Shu; Dong Wang; Xuejie Cao; Yasutaka Kamei; Junjie Chen

arXiv:2408.01760·cs.SE·August 6, 2024

TL;DR

This empirical study evaluates large language models for detecting equivalent mutants in Java code and finds that they outperform existing methods while balancing detection accuracy against computational cost.

Contribution

First comprehensive empirical analysis of LLMs for equivalent mutant detection, demonstrating their superior performance and efficiency over traditional techniques.

Findings

01. LLM-based techniques outperform existing equivalent mutant detection (EMD) methods by 35.69% in F1-score.

02. Fine-tuned code embeddings are the most effective LLM strategy.

03. LLMs offer a good balance between detection effectiveness and computational cost.

Abstract

Mutation testing is vital for ensuring software quality. However, the presence of equivalent mutants is known to introduce redundant cost and bias issues, hindering the effectiveness of mutation testing in practical use. Although numerous equivalent mutant detection (EMD) techniques have been proposed, they exhibit limitations due to the scarcity of training data and challenges in generalizing to unseen mutants. Recently, large language models (LLMs) have been extensively adopted in various code-related tasks and have shown superior performance by more accurately capturing program semantics. Yet the performance of LLMs in equivalent mutant detection remains largely unclear. In this paper, we conduct an empirical study on 3,302 method-level Java mutant pairs to comprehensively investigate the effectiveness and efficiency of LLMs for equivalent mutant detection. Specifically, we assess…
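
To make the notion of an equivalent mutant concrete, here is a minimal Java sketch of a method-level mutant pair of the kind the abstract describes. The class and method names are hypothetical and are not taken from the paper's dataset. The first mutant replaces `<` with `!=` in the loop condition; since the counter starts at 0 and rises by 1 up to a non-negative length, no input can distinguish it from the original, so it is equivalent. The second mutant replaces `+=` with `-=` and changes observable behavior, so a test can kill it.

```java
public class EquivalentMutantDemo {
    // Original method: sums the elements of an array.
    public static int sumOriginal(int[] values) {
        int total = 0;
        for (int i = 0; i < values.length; i++) {
            total += values[i];
        }
        return total;
    }

    // Equivalent mutant: '<' replaced by '!='.
    // i counts 0, 1, 2, ... and length >= 0, so both conditions
    // become false at exactly the same iteration.
    public static int sumEquivalentMutant(int[] values) {
        int total = 0;
        for (int i = 0; i != values.length; i++) {
            total += values[i];
        }
        return total;
    }

    // Non-equivalent mutant: '+=' replaced by '-='.
    // Any array with a non-zero sum exposes the difference.
    public static int sumNonEquivalentMutant(int[] values) {
        int total = 0;
        for (int i = 0; i < values.length; i++) {
            total -= values[i];
        }
        return total;
    }

    public static void main(String[] args) {
        int[] sample = {3, 1, 4};
        // Same result as the original: no test input can kill this mutant.
        System.out.println(sumOriginal(sample) == sumEquivalentMutant(sample));
        // Different result: this mutant is killed by the sample input.
        System.out.println(sumOriginal(sample) == sumNonEquivalentMutant(sample));
    }
}
```

An equivalent mutant like the first one can never be killed by any test, which is why it adds cost and skews the mutation score unless it is detected and filtered out.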

Taxonomy

Topics: Topic Modeling · Biomedical Text Mining and Ontologies · Genetics, Bioinformatics, and Biomedical Research