Untangling Fine-Grained Code Changes
Martín Dias (INRIA Lille - Nord Europe), Alberto Bacchelli, Georgios Gousios, Damien Cassou (INRIA Lille - Nord Europe), Stéphane Ducasse (INRIA Lille - Nord Europe)

TL;DR
This paper introduces a publicly available dataset of untangled code changes and a tool, EpiceaUntangler, that helps developers split tangled code changes into atomic commits, making code review easier and historical project analysis more reliable.
Contribution
It provides the first publicly available dataset of manually untangled code changes and presents a novel approach, EpiceaUntangler, for automatic untangling using fine-grained change information.
Findings
Median success rate of 91% in automatic untangling
Average success rate of 75% in real-world deployment
Effective assistance for developers in creating atomic commits
Abstract
After working for some time, developers commit their code changes to a version control system. When doing so, they often bundle unrelated changes (e.g., bug fix and refactoring) in a single commit, thus creating a so-called tangled commit. Sharing tangled commits is problematic because it makes review, reversion, and integration of these commits harder and historical analyses of the project less reliable. Researchers have worked at untangling existing commits, i.e., finding which part of a commit relates to which task. In this paper, we contribute to this line of work in two ways: (1) a publicly available dataset of untangled code changes, created with the help of two developers who accurately split their code changes into self-contained tasks over a period of four months; (2) a novel approach, EpiceaUntangler, to help developers share untangled commits (a.k.a. atomic commits) by using…
Taxonomy
Topics: Software Engineering Research · Software System Performance and Reliability · Advanced Malware Detection Techniques
