CCRep: Learning Code Change Representations via Pre-Trained Code Model and Query Back
Zhongxin Liu, Zhijie Tang, Xin Xia, Xiaohu Yang

TL;DR
CCRep is a novel approach that leverages pre-trained code models and a query back mechanism to generate high-quality, task-agnostic code change representations, improving performance across multiple software engineering tasks.
Contribution
This work introduces CCRep, a code change encoder built on a pre-trained code model and a query back mechanism, which can be used and jointly trained on various downstream tasks.
Findings
CCRep outperforms state-of-the-art methods on three tasks.
It effectively encodes code changes for multiple downstream tasks.
Experimental results validate its broad applicability and superior performance.
Abstract
Representing code changes as numeric feature vectors, i.e., code change representations, is usually an essential step to automate many software engineering tasks related to code changes, e.g., commit message generation and just-in-time defect prediction. Intuitively, the quality of code change representations is crucial for the effectiveness of automated approaches. Prior work on code changes usually designs and evaluates code change representation approaches for a specific task, and little work has investigated code change encoders that can be used and jointly trained on various tasks. To fill this gap, this work proposes a novel Code Change Representation learning approach named CCRep, which can learn to encode code changes as feature vectors for diverse downstream tasks. Specifically, CCRep regards a code change as the combination of its before-change and after-change code, leverages…
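The abstract's starting point, treating a code change as the combination of its before-change and after-change code, can be illustrated with a minimal sketch. This is not CCRep's implementation: the `CodeChange` class, the `changed_fragments` method, and the whitespace tokenization are all assumptions made for illustration, and the sketch covers only the input representation, not the pre-trained model or the learned encoder.

```python
import difflib
from dataclasses import dataclass


@dataclass
class CodeChange:
    """Hypothetical container: a change is a (before, after) code pair."""
    before: str
    after: str

    def changed_fragments(self):
        """Return (removed, added) token spans from a token-level diff.

        Whitespace tokenization is an assumption for illustration only;
        a real system would use the tokenizer of its code model.
        """
        old = self.before.split()
        new = self.after.split()
        matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
        removed, added = [], []
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag in ("replace", "delete"):
                removed.append(old[i1:i2])  # tokens present only before the change
            if tag in ("replace", "insert"):
                added.append(new[j1:j2])    # tokens present only after the change
        return removed, added


change = CodeChange(before="return a + b", after="return a - b")
removed, added = change.changed_fragments()
print(removed, added)
```

Keeping both versions of the code, rather than only the diff lines, preserves the unchanged context around each edit, which is what lets an encoder relate the changed fragments to the code surrounding them.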
Taxonomy
Topics: Software Engineering Research · Web Data Mining and Analysis · Software System Performance and Reliability
