Test Set Diameter: Quantifying the Diversity of Sets of Test Cases
Robert Feldt, Simon Poulding, David Clark, Shin Yoo

TL;DR
This paper introduces the test set diameter (TSDm), a new universal metric for quantifying the diversity of test sets across data types, aiding in better test selection and software quality assessment.
Contribution
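It proposes TSDm, which extends the authors' earlier pairwise test diversity metrics to whole sets of tests by using the normalized compression distance (NCD) for multisets. The metric can be applied to any data type and to any test-related information, not only test inputs.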
Findings
Test sets selected with TSDm achieve higher coverage than randomly selected test sets.
It applies to any data type and to any test-related information, not only test inputs.
It enables early test design and complements existing testing methods.
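To make the selection finding concrete, here is a minimal Python sketch of diversity-driven test selection. It is a greedy heuristic under stated assumptions, not necessarily the procedure the paper uses: the names `set_diversity` and `select_diverse` are hypothetical, zlib stands in for whatever compressor the authors chose, and the diversity score assumes the commonly cited NCD_1 form for multisets (compressed size of the whole set minus the smallest single-item compressed size, divided by the largest compressed size of the set with one item left out).

```python
import zlib


def _clen(data: bytes) -> int:
    """Compressed length in bytes; zlib is an assumed stand-in compressor."""
    return len(zlib.compress(data, 9))


def set_diversity(items: list[bytes]) -> float:
    """NCD_1-style diversity of a multiset (assumed form); 0.0 below two items."""
    if len(items) < 2:
        return 0.0
    whole = _clen(b"".join(items))
    smallest = min(_clen(x) for x in items)
    largest_rest = max(
        _clen(b"".join(items[:i] + items[i + 1:])) for i in range(len(items))
    )
    return (whole - smallest) / largest_rest


def select_diverse(pool: list[bytes], k: int) -> list[int]:
    """Greedily pick k indices from pool, growing the most diverse set found."""
    if not 0 < k <= len(pool):
        raise ValueError("k must be between 1 and len(pool)")
    # Seed with the item that is hardest to compress (an arbitrary heuristic).
    chosen = [max(range(len(pool)), key=lambda i: _clen(pool[i]))]
    while len(chosen) < k:
        remaining = [i for i in range(len(pool)) if i not in chosen]
        # Add the candidate that yields the highest diversity of the grown set.
        best = max(
            remaining,
            key=lambda i: set_diversity([pool[j] for j in chosen] + [pool[i]]),
        )
        chosen.append(best)
    return chosen


# Stand-in test inputs: two identical ones and one that shares no content.
pool = [bytes(range(0, 80)), bytes(range(0, 80)), bytes(range(80, 160))]
picked = select_diverse(pool, 2)
```

With this pool, the two selected inputs are the two distinct ones; the duplicate adds almost nothing to the compressed size of the set, so it scores low. Each greedy step recompresses the candidate set several times, which hints at the computational cost the abstract mentions as a downside.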
Abstract
A common and natural intuition among software testers is that test cases need to differ if a software system is to be tested properly and its quality ensured. Consequently, much research has gone into formulating distance measures for how test cases, their inputs and/or their outputs differ. However, common to these proposals is that they are data type specific and/or calculate the diversity only between pairs of test inputs, traces or outputs. We propose a new metric to measure the diversity of sets of tests: the test set diameter (TSDm). It extends our earlier, pairwise test diversity metrics based on recent advances in information theory regarding the calculation of the normalized compression distance (NCD) for multisets. An advantage is that TSDm can be applied regardless of data type and on any test-related information, not only the test inputs. A downside is the increased…
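The set-level measure the abstract describes can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' implementation: the function names are hypothetical, zlib is an assumed choice of compressor, and the formula is the commonly cited NCD_1 form for multisets, which the paper's full TSDm definition may refine further.

```python
import zlib


def compressed_len(data: bytes) -> int:
    """Length of the zlib-compressed data (assumed compressor choice)."""
    return len(zlib.compress(data, 9))


def ncd1(items: list[bytes]) -> float:
    """NCD_1-style distance for a multiset of byte strings (assumed form).

    (C(X) - min_x C(x)) / max_x C(X without x), where C is compressed length
    and X is compressed as the concatenation of its items.
    """
    if len(items) < 2:
        raise ValueError("need at least two items")
    whole = compressed_len(b"".join(items))
    smallest = min(compressed_len(x) for x in items)
    largest_rest = max(
        compressed_len(b"".join(items[:i] + items[i + 1:]))
        for i in range(len(items))
    )
    return (whole - smallest) / largest_rest


# Stand-ins for serialized test inputs, traces, or outputs.
similar = [bytes(range(0, 80))] * 3
diverse = [bytes(range(0, 80)), bytes(range(80, 160)), bytes(range(160, 240))]

similar_score = ncd1(similar)
diverse_score = ncd1(diverse)
```

Because the function only sees bytes, the same code works on any serialized test information, which is the data-type independence the abstract claims. The set of repeated inputs scores lower than the set of distinct inputs, since repeats compress away almost entirely.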
Taxonomy
Topics: Software Testing and Debugging Techniques · Software Engineering Research · Advanced Malware Detection Techniques
