$\mu$BERT: Mutation Testing using Pre-Trained Language Models
Renzo Degiovanni, Mike Papadakis

TL;DR
$\mu$BERT is a mutation testing tool leveraging pre-trained language models to generate mutants, demonstrating improved fault detection and cost-effectiveness over traditional methods like PiTest.
Contribution
It introduces a novel mutation testing approach using CodeBERT for mutant generation, enhancing fault detection and cost efficiency.
Findings
Detects 27 out of 40 real faults, outperforming PiTest's 26.
Can be up to twice as cost-effective as PiTest when the same number of mutants is analysed.
Produces mutants that improve program assertion inference and specification quality.
Abstract
We introduce $\mu$BERT, a mutation testing tool that uses a pre-trained language model (CodeBERT) to generate mutants. This is done by masking a token from the expression given as input and using CodeBERT to predict it. Thus, the mutants are generated by replacing the masked tokens with the predicted ones. We evaluate $\mu$BERT on 40 real faults from Defects4J and show that it can detect 27 out of the 40 faults, while the baseline (PiTest) detects 26 of them. We also show that $\mu$BERT can be 2 times more cost-effective than PiTest, when the same number of mutants are analysed. Additionally, we evaluate the impact of $\mu$BERT's mutants when used by program assertion inference techniques, and show that they can help in producing better specifications. Finally, we discuss the quality and naturalness of some interesting mutants produced by $\mu$BERT during our experimental…
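The mask-and-predict loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the tool's actual implementation: `tokenize`, `generate_mutants`, and `toy_predict` are hypothetical names, the regex tokenizer is a simplification, and `toy_predict` is a hard-coded stand-in for CodeBERT so the example runs without downloading a model. Skipping predictions that reproduce the original token is likewise an assumption of this sketch.

```python
import re
from typing import Callable, List

# CodeBERT is RoBERTa-based, so "<mask>" is assumed as the mask token here.
MASK = "<mask>"

# Hypothetical, simplified tokenizer: identifiers, integers,
# a few two-character operators, then any other single symbol.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|==|!=|<=|>=|&&|\|\||\S")


def tokenize(expression: str) -> List[str]:
    return TOKEN_RE.findall(expression)


def generate_mutants(
    expression: str, predict: Callable[[str], List[str]]
) -> List[str]:
    """Mask each token in turn, ask `predict` for replacements,
    and build one mutant per prediction that differs from the original."""
    tokens = tokenize(expression)
    mutants: List[str] = []
    for i, original in enumerate(tokens):
        masked = " ".join(tokens[:i] + [MASK] + tokens[i + 1:])
        for candidate in predict(masked):
            if candidate == original:
                continue  # same token as the original: not a mutant
            mutant = " ".join(tokens[:i] + [candidate] + tokens[i + 1:])
            if mutant not in mutants:
                mutants.append(mutant)
    return mutants


def toy_predict(masked_code: str) -> List[str]:
    """Stand-in for the language model (an assumption for illustration).
    It proposes relational operators only when the mask sits between
    two operands, and nothing otherwise."""
    parts = masked_code.split()
    i = parts.index(MASK)
    between_operands = (
        0 < i < len(parts) - 1
        and parts[i - 1][0].isalnum()
        and parts[i + 1][0].isalnum()
    )
    return [">", ">=", "<"] if between_operands else []
```

With the toy predictor, `generate_mutants("a > b", toy_predict)` returns `["a >= b", "a < b"]`: the prediction `>` is dropped because it matches the original token. In a real setup, `predict` would instead wrap a masked-language-model call and return its top-ranked tokens for the masked position.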
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques
