Using mutation testing to measure behavioural test diversity

Francisco Gomes de Oliveira Neto; Felix Dobslaw; Robert Feldt

arXiv:2010.09144·cs.SE·October 20, 2020

TL;DR

This paper introduces mutation testing-based measures that quantify behavioural test diversity (b-div). When used to prioritise test suites, these measures outperform artefact-based diversity (a-div) and random selection across multiple open-source projects.

Contribution

It proposes novel mutation testing-derived metrics for behavioural diversity. These metrics do not depend on the reliable test execution logs that history-based approaches require, and they improve test suite prioritisation.

Findings

01

b-div measures outperform a-div and random selection

02

Average increase in APFD (average percentage of faults detected) of 19% to 31% across projects

03

Effective in prioritizing tests for fault detection
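
For readers unfamiliar with APFD, the following is a minimal sketch of the standard formula, APFD = 1 - (TF_1 + ... + TF_m) / (n * m) + 1 / (2n), where n is the number of tests, m the number of faults, and TF_i the position of the first test that detects fault i. The function name `apfd` and the `detects` mapping are hypothetical names chosen for illustration, and the sketch assumes every fault is detected by at least one test in the ordering.

```python
def apfd(order, detects):
    """Average percentage of faults detected for one test ordering.

    order   -- list of test ids in prioritised order (hypothetical input format)
    detects -- dict mapping test id -> set of fault ids that test detects
    """
    faults = set().union(*detects.values()) if detects else set()
    n, m = len(order), len(faults)
    if n == 0 or m == 0:
        raise ValueError("need at least one test and one fault")

    # Record the 1-based position of the first test that reveals each fault.
    first_pos = {}
    for pos, test in enumerate(order, start=1):
        for fault in detects.get(test, ()):
            first_pos.setdefault(fault, pos)

    # Assumption of this sketch: every fault is revealed by the ordering.
    if len(first_pos) < m:
        raise ValueError("some faults are never detected by this ordering")

    return 1 - sum(first_pos.values()) / (n * m) + 1 / (2 * n)


# Toy data: t1 reveals f1, t3 reveals f2, t2 reveals nothing.
detects = {"t1": {"f1"}, "t2": set(), "t3": {"f2"}}
baseline = apfd(["t1", "t2", "t3"], detects)     # f2 found at position 3
prioritised = apfd(["t1", "t3", "t2"], detects)  # f2 found at position 2
```

Moving the fault-revealing test `t3` earlier raises the score, which is the effect a prioritisation technique aims for.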

Abstract

Diversity has been proposed as a key criterion to improve testing effectiveness and efficiency. It can be used to optimise large test repositories but also to visualise test maintenance issues and raise practitioners' awareness about waste in test artefacts and processes. Even though these diversity-based testing techniques aim to exercise diverse behaviour in the system under test (SUT), the diversity has mainly been measured on and between artefacts (e.g., inputs, outputs or test scripts). Here, we introduce a family of measures to capture behavioural diversity (b-div) of test cases by comparing their executions and failure outcomes. Using failure information to capture the SUT behaviour has been shown to improve effectiveness of history-based test prioritisation approaches. However, history-based techniques require reliable test execution logs which are often not available or can be…
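
The idea of comparing failure outcomes can be illustrated with a small sketch. It assumes each test has been run against the same list of mutated SUT versions and that the outcome is recorded as a boolean (True when the test fails on that mutant). The disagreement ratio used here is an illustrative choice and not necessarily one of the measures the paper defines; `kill_matrix`, `disagreement`, and `pairwise_distances` are hypothetical names.

```python
from itertools import combinations


def disagreement(a, b):
    """Fraction of mutants on which two tests have different outcomes."""
    if len(a) != len(b) or not a:
        raise ValueError("outcome vectors must be non-empty and equally long")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def pairwise_distances(kill_matrix):
    """Behavioural distance for every unordered pair of tests."""
    return {
        (s, t): disagreement(kill_matrix[s], kill_matrix[t])
        for s, t in combinations(sorted(kill_matrix), 2)
    }


# Toy data: one row per test, one column per mutant (True = test fails).
kill_matrix = {
    "t1": [True, True, False, False],
    "t2": [True, True, False, False],   # behaves exactly like t1
    "t3": [False, False, True, False],  # fails on a different mutant
}
distances = pairwise_distances(kill_matrix)
```

In this toy data `t1` and `t2` are behaviourally identical (distance 0.0), while `t3` differs from both on three of the four mutants (distance 0.75), so a diversity-based prioritisation would favour running `t3` alongside only one of the other two.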
