Evaluating Representation Learning of Code Changes for Predicting Patch Correctness in Program Repair
Haoye Tian, Kui Liu, Abdoul Kader Kaboré, Anil Koyuncu, Li Li, Jacques Klein, Tegawendé F. Bissyandé

TL;DR
This paper explores how learned code representations, especially embeddings from neural networks like BERT, can predict patch correctness in program repair, showing promising results comparable to existing dynamic-based methods.
Contribution
It investigates the effectiveness of representation learning approaches for code changes in predicting patch correctness, highlighting the potential of neural embeddings in this task.
Findings
Embeddings from BERT-based models achieved an AUC of about 0.8.
Learned representations are competitive with state-of-the-art dynamic methods.
Representations can complement manually engineered features in patch correctness prediction.
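To make the AUC figure above concrete, here is a minimal sketch of how an AUC can be computed from classifier scores. It uses the pairwise-ranking definition: the fraction of (correct, incorrect) patch pairs in which the correct patch receives the higher score. The function name `roc_auc` and the labels and scores are hypothetical toy values for illustration, not data or code from the paper.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = correct patch, 0 = incorrect (overfitting) patch.
toy_labels = [1, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.4, 0.7, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)
print(f"AUC on toy data: {toy_auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so a value of about 0.8 means a correct patch outranks an incorrect one in roughly four out of five such pairs.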
Abstract
A large body of the literature on automated program repair develops approaches where patches are generated to be validated against an oracle (e.g., a test suite). Because such an oracle can be imperfect, the generated patches, although validated by the oracle, may actually be incorrect. While the state of the art explores research directions that require dynamic information or rely on manually-crafted heuristics, we study the benefit of learning code representations to learn deep features that may encode the properties of patch correctness. Our work mainly investigates different representation learning approaches for code changes to derive embeddings that are amenable to similarity computations. We report on findings based on embeddings produced by pre-trained and re-trained neural networks. Experimental results demonstrate the potential of embeddings to empower learning algorithms in…
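The abstract refers to embeddings that are "amenable to similarity computations". As a rough illustration of what such a computation might look like, the sketch below computes cosine similarity between the embedding of a buggy code fragment and the embeddings of two candidate patches. The vectors are hypothetical toy values, not outputs of BERT or any other model from the paper, and cosine similarity is assumed here as one common choice of metric.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical 3-dimensional embeddings; real models produce hundreds of dimensions.
buggy_embedding = [1.0, 0.0, 1.0]
patch_a_embedding = [1.0, 0.1, 0.9]   # small change relative to the buggy code
patch_b_embedding = [0.0, 1.0, 0.0]   # very different from the buggy code

sim_a = cosine_similarity(buggy_embedding, patch_a_embedding)
sim_b = cosine_similarity(buggy_embedding, patch_b_embedding)
print(f"patch A similarity: {sim_a:.3f}")
print(f"patch B similarity: {sim_b:.3f}")
```

Similarity scores like these could serve as input features for a classifier, or be compared against a threshold. Which of those the authors do, and with what settings, is not stated in the truncated abstract.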
Taxonomy
Topics: Software Engineering Research · Software Testing and Debugging Techniques · Software Reliability and Analysis Research
