# Ahead of Time Mutation Based Fault Localisation using Statistical Inference

**Authors:** Jinhan Kim, Gabin An, Robert Feldt, Shin Yoo

arXiv: 1902.09729 · 2022-09-15

## TL;DR

SIMFL is a mutation-based fault localization technique that runs mutation analysis before any failure is observed, so that localizing a fault after a failure needs little or no additional analysis. On Defects4J it outperforms existing MBFL techniques.

## Contribution

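Introduces SIMFL, an MBFL approach that amortizes the cost of mutation analysis by performing it ahead of time and learning, from the mutants, which test cases fail for which mutation locations. Once a real failure occurs, fault localization uses these precomputed results.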

## Key findings

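- Localizes up to 113 of the 203 studied Defects4J faults (55%) at the top rank
- Places 159 faults (78%) within the top five positions
- Retains 80% of its top-rank localization accuracy when using only 10% of the generated mutants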

## Abstract

Mutation analysis can effectively capture the dependency between source code and test results. This has been exploited by Mutation Based Fault Localisation (MBFL) techniques. However, MBFL techniques suffer from the need to expend the high cost of mutation analysis after the observation of failures, which may present a challenge for their practical adoption. We introduce SIMFL (Statistical Inference for Mutation-based Fault Localisation), an MBFL technique that allows users to perform the mutation analysis in advance, before a failure is observed, allowing the amortisation of the analysis cost. SIMFL uses mutants as artificial faults and aims to learn the failure patterns among test cases against different locations of mutations. Once a failure is observed, SIMFL requires either almost no or very small additional cost for analysis, depending on the inference model used. An empirical evaluation using Defects4J shows that SIMFL can successfully localise up to 113 out of 203 studied faults (55%) at the top, and 159 (78%) faults within the top five, significantly outperforming existing MBFL techniques while using the results of mutation analysis that has been undertaken before the test failure. The amortised cost of mutation analysis can be further reduced by mutation sampling: SIMFL retains 80% of its localisation accuracy at the top rank when using only 10% of generated mutants, compared to results obtained without sampling.
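
The idea of learning failure patterns from mutants ahead of time can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a simplified exact-matching model in which each method is scored by the share of pattern-matching mutants it contains, where a mutant matches when the set of tests that killed it equals the observed set of failing tests. The names `rank_methods`, `sample_mutants`, and the `kill_matrix` layout are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def rank_methods(kill_matrix, failing_tests):
    """Rank methods by how often their mutants reproduce the failure pattern.

    kill_matrix: dict mapping mutant id -> (method name, frozenset of tests
    that killed the mutant). This layout is an assumption of the sketch.
    """
    failing = frozenset(failing_tests)
    matches = Counter()
    methods = set()
    for method, killed_by in kill_matrix.values():
        methods.add(method)
        if killed_by == failing:  # exact match with the observed failure
            matches[method] += 1
    total = sum(matches.values())
    # Share of matching mutants located in each method (0.0 if none match).
    scores = {m: (matches[m] / total if total else 0.0) for m in methods}
    # Highest score first; ties broken by method name for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


def sample_mutants(kill_matrix, rate, seed=0):
    """Keep a random fraction of the mutants to reduce the up-front cost."""
    rng = random.Random(seed)
    k = max(1, round(rate * len(kill_matrix)))
    chosen = rng.sample(sorted(kill_matrix), k)
    return {mutant_id: kill_matrix[mutant_id] for mutant_id in chosen}


# Toy data, computed before any real failure is seen (illustrative only).
kill_matrix = {
    "m1": ("Foo.bar", frozenset({"t1", "t2"})),
    "m2": ("Foo.bar", frozenset({"t1"})),
    "m3": ("Foo.baz", frozenset({"t1", "t2"})),
    "m4": ("Foo.bar", frozenset({"t1", "t2"})),
    "m5": ("Foo.qux", frozenset()),
}

# After a failure: tests t1 and t2 fail, so rank methods by matching mutants.
ranking = rank_methods(kill_matrix, {"t1", "t2"})
print(ranking)

# Optional mutation sampling, here keeping 40% of the toy mutants.
sampled = sample_mutants(kill_matrix, rate=0.4)
print(rank_methods(sampled, {"t1", "t2"}))
```

In this sketch all of the expensive work (building `kill_matrix`) happens before the failure, and the post-failure step is a single pass over stored results, which mirrors the amortisation argument made in the abstract. Sampling simply shrinks `kill_matrix` before ranking.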

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.09729/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1902.09729/full.md

## References

32 references — full list in the complete paper: https://tomesphere.com/paper/1902.09729/full.md

---
Source: https://tomesphere.com/paper/1902.09729