An Empirical Analysis of Git Commit Logs for Potential Inconsistency in Code Clones
Reishi Yokomori, Katsuro Inoue

TL;DR
This study analyzes git commit logs of code clone pairs across 45 Apache Software Foundation repositories. It finds that clone snippets change infrequently, that only part of their changes are co-changes, and that a substantial share of clone pairs are potentially inconsistent, which points to a need for better clone management.
Contribution
The paper provides an empirical analysis of commit logs for code clone pairs, reporting change frequency, co-change ratio, and potential inconsistencies. Commit logs of clone pairs had previously received little detailed analysis.
Findings
Clone snippets are infrequently changed, typically 2-3 times.
About 50% of changes to clones are co-changes, and 10-20% are potentially inconsistent.
35-65% of clone pairs are potentially inconsistent, indicating a need for better management.
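To make the co-change figures above concrete, here is a minimal Python sketch of how a co-change ratio could be computed for one clone pair. This summary does not give the paper's exact definitions, so the sketch makes its own assumptions: a co-change is a commit that touches both snippets, and a one-sided commit is only a candidate for inconsistency, not proof of it. The function name and the returned keys are hypothetical.

```python
def co_change_summary(commits_a, commits_b):
    """Summarize how often two cloned snippets changed in the same commit.

    commits_a / commits_b: iterables of commit hashes that touched each snippet.
    Assumption (not taken from the paper): a co-change is a commit present in
    both histories; a commit present in only one is flagged as one-sided.
    """
    a, b = set(commits_a), set(commits_b)
    both = a & b            # commits that changed both snippets
    either = a | b          # commits that changed at least one snippet
    one_sided = either - both
    ratio = len(both) / len(either) if either else 0.0
    return {
        "co_changed": len(both),
        "one_sided": len(one_sided),
        "co_change_ratio": ratio,
    }


summary = co_change_summary(["c1", "c2", "c3"], ["c2", "c3", "c4"])
print(summary)
```

In this toy input, two of the four distinct commits touch both snippets, so the ratio is 0.5. The two one-sided commits would still need manual inspection before calling the pair inconsistent.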
Abstract
Code clones are code snippets that are identical or similar to other snippets within the same or different files. They are often created through copy-and-paste practices and modified during development and maintenance activities. Since a pair of code clones, known as a clone pair, has a possible logical coupling between them, it is expected that changes to each snippet are made simultaneously (co-changed) and consistently. There is extensive research on code clones, including studies related to the co-change of clones; however, detailed analysis of commit logs for code clone pairs has been limited. In this paper, we investigate the commit logs of code snippets from clone pairs, using the git-log command to extract changes to cloned code snippets. We analyzed 45 repositories owned by the Apache Software Foundation on GitHub and addressed three research questions regarding commit…
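The abstract says changes to cloned snippets are extracted with the git-log command, but this summary does not show the exact invocation. One plausible way to do it, sketched below in Python, is the line-range option `git log -L<start>,<end>:<file>`, which traces the history of a range of lines. The marker string and the function names are hypothetical, and the line range is assumed to come from a clone detector.

```python
import subprocess

# Hypothetical marker. Diff lines start with ' ', '+', '-', '@@', or 'diff',
# so a line starting with this marker can only be a commit header.
MARKER = "COMMIT "


def build_log_command(path, start, end):
    """Build a git-log command that traces lines start..end of path."""
    return ["git", "log", f"--format={MARKER}%H", f"-L{start},{end}:{path}"]


def parse_commit_hashes(log_output):
    """Extract commit hashes from output produced by build_log_command."""
    return [
        line[len(MARKER):].strip()
        for line in log_output.splitlines()
        if line.startswith(MARKER)
    ]


def snippet_commits(repo_dir, path, start, end):
    """Return hashes of commits that changed the given line range (newest first)."""
    result = subprocess.run(
        build_log_command(path, start, end),
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_commit_hashes(result.stdout)
```

Calling `snippet_commits` once for each snippet of a clone pair yields two commit lists that can then be compared to see which changes were shared.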
Taxonomy
Topics: Evolutionary Algorithms and Applications · Advanced Malware Detection Techniques
