Is Surprisal in Issue Trackers Actionable?
James Caddy, Markus Wagner, Christoph Treude, Earl T. Barr, Miltiadis Allamanis

TL;DR
This paper explores whether surprisal, a measure from information theory, can be used to identify surprising and potentially important or problematic issues in software repositories, aiding early detection of challenges.
Contribution
It proposes a novel method using language models to measure surprisal in GitHub issues and pull requests, aiming to assess their importance and difficulty.
Findings
Potential correlation between surprisal and issue importance
Surprisal may help identify challenging issues early
Method applied to 5000 repositories for validation
Abstract
Background. In information theory, surprisal measures how unexpected an event is. Statistical language models provide a probabilistic approximation of natural languages, and because surprisal is defined in terms of the probability of an event occurring, it is possible to determine the surprisal associated with English sentences. The issues and pull requests in software repository issue trackers give insight into the development process and likely contain the surprising events of this process. Objective. Prior work has identified that unusual events in software repositories are of interest to developers, and has used simple methods based on code metrics to detect them. In this study we will propose a new method for unusual event detection in software repositories using surprisal. With the ability to find surprising issues and pull requests, we intend to further analyse…
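To make the Background concrete: the surprisal of an event with probability p is -log2(p) bits, so rarer events score higher. The sketch below is a minimal illustration, not the authors' method. It assumes a toy add-one-smoothed unigram model in place of the statistical language model the abstract refers to, and the function names (`surprisal_bits`, `train_unigram`, `mean_surprisal`) and the sample issue titles are hypothetical.

```python
import math
from collections import Counter


def surprisal_bits(p):
    """Surprisal, in bits, of an event that occurs with probability p."""
    return -math.log2(p)


def train_unigram(corpus_tokens):
    """Toy stand-in for a language model (an assumption for illustration):
    an add-one-smoothed unigram model that reserves one extra vocabulary
    slot shared by all unseen tokens."""
    counts = Counter(corpus_tokens)
    vocab_size = len(counts) + 1  # +1 for the shared "unseen" slot
    total = sum(counts.values())

    def prob(token):
        return (counts[token] + 1) / (total + vocab_size)

    return prob


def mean_surprisal(text, prob):
    """Average per-token surprisal of a text under the given model."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(surprisal_bits(prob(t)) for t in tokens) / len(tokens)


# Hypothetical issue titles used as training text.
corpus = "fix typo in readme fix typo in docs fix build".split()
prob = train_unigram(corpus)

# A routine title should score lower than one made of unseen words.
routine = mean_surprisal("fix typo", prob)
unusual = mean_surprisal("segfault kernel", prob)
print(routine < unusual)
```

Averaging per token keeps long and short texts comparable. A real study would presumably use a far stronger language model than this unigram toy, and the abstract does not specify which one.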
Taxonomy
Topics: Software Engineering Research · Software Reliability and Analysis Research · Software Engineering Techniques and Practices
