Bad Smells in Software Analytics Papers
Tim Menzies, Martin Shepperd

TL;DR
This paper identifies 12 common issues, termed 'bad smells', in software analytics research papers, aiming to improve study reliability and guide both producers and consumers of such studies.
Contribution
It introduces the 'bad smells' metaphor to diagnose and discuss potential problems in software analytics research papers, providing a basis for improving study quality.
Findings
Lists 12 'bad smells' with examples
Shows the impact of 'bad smells' on study validity through examples
Encourages ongoing debate on research validity
Abstract
CONTEXT: There has been a rapid growth in the use of data analytics to underpin evidence-based software engineering. However, the combination of complex techniques, diverse reporting standards and poorly understood underlying phenomena is causing some concern as to the reliability of studies. OBJECTIVE: Our goal is to provide guidance for producers and consumers of software analytics studies (computational experiments and correlation studies). METHOD: We propose using "bad smells", i.e., surface indications of deeper problems, a notion popular in the agile software community, and consider how they may manifest in software analytics studies. RESULTS: We list 12 "bad smells" in software analytics papers (and show their impact by examples). CONCLUSIONS: We believe the metaphor of bad smell is a useful device. Therefore we encourage more debate on what contributes to the validity of…
Taxonomy
Topics: Software Engineering Research · Software System Performance and Reliability · Scientific Computing and Data Management
