CodeMapper: A Language-Agnostic Approach to Mapping Code Regions Across Commits
Huimin Hu, Michael Pradel

TL;DR
CodeMapper is a language-agnostic tool that maps a developer-selected code region from one commit to another. It combines diff analysis, detection of code movements, and similarity matching, and it outperforms existing methods across multiple programming languages.
Contribution
The paper introduces a novel, language-independent approach to mapping code regions across commits, addressing the limitations of prior techniques that target specific languages or code elements.
Findings
Achieves 71.0% to 94.5% accuracy in code region mapping
Outperforms baseline methods by 1.5 to 58.8 percentage points
Validated on four diverse datasets covering ten programming languages
Abstract
During software evolution, developers commonly face the problem of mapping a specific code region from one commit to another. For example, they may want to determine how the condition of an if-statement, a specific line in a configuration file, or the definition of a function changes. We call this the code mapping problem. Existing techniques, such as git diff, address this problem only insufficiently because they show all changes made to a file instead of focusing on a code region of the developer's choice. Other techniques focus on specific code elements and programming languages (e.g., methods in Java), limiting their applicability. This paper introduces CodeMapper, an approach to address the code mapping problem in a way that is independent of specific program elements and programming languages. Given a code region in one commit, CodeMapper finds the corresponding region in another…
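To make the code mapping problem concrete, here is a minimal sketch of a purely diff-based mapper. It is not CodeMapper's algorithm: it assumes that a line-level diff from Python's standard `difflib` is enough, and the function name `map_region` and the sample inputs are hypothetical. Given a line range in the old version of a file, it returns the range that the unchanged lines of that region occupy in the new version.

```python
import difflib


def map_region(old_lines, new_lines, start, end):
    """Map the 0-based, half-open line range [start, end) from old to new.

    Illustrative baseline only: it follows lines that the diff reports as
    unchanged and returns None when no line of the region survives.
    """
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    mapped = []
    for block in matcher.get_matching_blocks():
        # block.a and block.b are the start indices of an identical run of
        # lines in old and new; block.size is the length of that run.
        lo = max(start, block.a)
        hi = min(end, block.a + block.size)
        for i in range(lo, hi):
            mapped.append(block.b + (i - block.a))
    if not mapped:
        return None
    return min(mapped), max(mapped) + 1


old = ["def f(x):", "    if x > 0:", "        return x", "    return 0"]

# Two lines were inserted above the function, so the region shifts down.
shifted = ["import sys", ""] + old
print(map_region(old, shifted, 1, 2))

# The condition itself was edited, so no unchanged line remains to follow.
edited = ["def f(x):", "    if x >= 0:", "        return x", "    return 0"]
print(map_region(old, edited, 1, 2))
```

The first call finds the if-statement at its shifted position, while the second returns `None` because the only line in the region was modified. That second case hints at why a diff alone falls short and why an approach would also need to reason about moved and similar code.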
Taxonomy
Topics: Software Engineering Research · Software Testing and Debugging Techniques · Software System Performance and Reliability
