Predictive Mutation Analysis via Natural Language Channel in Source Code
Jinhan Kim, Juyoung Jeon, Shin Hong, Shin Yoo

TL;DR
Seshat is a predictive mutation analysis technique that uses the natural language channel in source code to predict entire kill matrices across program versions, cutting the cost of mutation testing.
Contribution
The paper introduces Seshat, which predicts full kill matrices from natural language features in code. It is more accurate than existing predictive mutation testing and much faster than running the actual mutation analysis, including across different program versions.
Findings
Achieves an average F-score of 0.83 in predicting kill matrices.
Outperforms state-of-the-art predictive mutation testing by 0.14 F-score points.
Predictions are on average 39 times faster than actual mutation analysis.
Abstract
Mutation analysis can provide valuable insights into both the System Under Test (SUT) and its test suite. However, it is not scalable due to the cost of building and testing a large number of mutants. Predictive Mutation Testing (PMT) has been proposed to reduce the cost of mutation testing, but it can only provide statistical inference about whether a mutant will be killed or not by the entire test suite. We propose Seshat, a Predictive Mutation Analysis (PMA) technique that can accurately predict the entire kill matrix, not just the mutation score of the given test suite. Seshat exploits the natural language channel in code, and learns the relationship between the syntactic and semantic concepts of each test case and the mutants it can kill, from a given kill matrix. The learnt model can later be used to predict the kill matrices for subsequent versions of the program, even after both the…
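To make the distinction between a kill matrix and a mutation score concrete, here is a minimal Python sketch. It is not Seshat's implementation: the tiny matrices are made-up illustrative data, the function names are hypothetical, and scoring the prediction per (mutant, test) cell is an assumption about how an F-score over a kill matrix could be computed.

```python
from typing import List

# A kill matrix: one row per mutant, one column per test case.
# km[m][t] is True when test t kills mutant m.
KillMatrix = List[List[bool]]


def mutation_score(km: KillMatrix) -> float:
    """Fraction of mutants killed by at least one test in the suite."""
    killed = sum(1 for row in km if any(row))
    return killed / len(km)


def f_score(actual: KillMatrix, predicted: KillMatrix) -> float:
    """F-score over individual (mutant, test) cells (assumed granularity)."""
    tp = fp = fn = 0
    for actual_row, predicted_row in zip(actual, predicted):
        for a, p in zip(actual_row, predicted_row):
            if a and p:
                tp += 1
            elif p and not a:
                fp += 1
            elif a and not p:
                fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Illustrative data only: 3 mutants x 3 test cases.
actual = [
    [True, False, False],
    [False, False, False],  # this mutant survives the whole suite
    [True, True, False],
]
predicted = [
    [True, False, False],
    [False, True, False],   # false positive: predicted kill that does not happen
    [True, False, False],   # false negative: misses the kill by the second test
]

print(mutation_score(actual))
print(mutation_score(predicted))
print(f_score(actual, predicted))
```

The mutation score collapses each row to a single killed-or-not bit, so it can be derived from a kill matrix but not the other way round. This is why predicting the full matrix carries more information than predicting only whether the whole suite kills each mutant.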
