Applying Bayesian Analysis Guidelines to Empirical Software Engineering Data: The Case of Programming Languages and Code Quality
Carlo A. Furia, Richard Torkar, Robert Feldt

TL;DR
This paper demonstrates how Bayesian analysis guidelines can be applied to empirical software engineering data, specifically analyzing programming languages and code quality, to produce valid, flexible, and insightful results.
Contribution
It adapts Bayesian analysis guidelines for empirical software engineering, illustrating their application through a reanalysis of a GitHub dataset on programming languages and code quality.
Findings
Bayesian techniques provide principled, flexible analysis of software engineering data.
The reanalysis offers new insights into the relationship between programming languages and code quality.
Following the guidelines helps produce convincing, valid results that advance empirical research.
Abstract
Statistical analysis is the tool of choice to turn data into information, and then information into empirical knowledge. To be valid, the process that goes from data to knowledge should be supported by detailed, rigorous guidelines, which help ferret out issues with the data or model, and lead to qualified results that strike a reasonable balance between generality and practical relevance. Such guidelines are being developed by statisticians to support the latest techniques for Bayesian data analysis. In this article, we frame these guidelines in a way that is apt to empirical research in software engineering. To demonstrate the guidelines in practice, we apply them to reanalyze a GitHub dataset about code quality in different programming languages. The dataset's original analysis (Ray et al., 2014) and a critical reanalysis (Berger et al., 2019) have attracted considerable attention…
Taxonomy
Topics: Software Engineering Research · Software Reliability and Analysis Research · Software Testing and Debugging Techniques
