The Best of Both Worlds: Combining Learned Embeddings with Engineered Features for Accurate Prediction of Correct Patches
Haoye Tian, Kui Liu, Yinghua Li, Abdoul Kader Kaboré, Anil Koyuncu, Andrew Habib, Li Li, Junhao Wen, Jacques Klein, Tegawendé F. Bissyandé

TL;DR
This paper presents a hybrid approach combining learned code embeddings with engineered features to improve the accuracy of predicting correct patches in automated program repair, outperforming existing methods.
Contribution
It introduces Panther, an enhanced prediction framework that integrates deep learned embeddings with engineered features, demonstrating superior accuracy over state-of-the-art techniques.
Findings
Learned embeddings improve patch correctness prediction.
Combining embeddings with engineered features enhances performance.
Panther outperforms Leopard and PATCH-SIM in accuracy metrics.
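One simple way to picture the combination described in the findings is to concatenate a patch's learned embedding with its engineered feature values into a single input vector for a classifier. The sketch below is illustrative only: the function name `combine_features`, the vector sizes, and all numeric values are hypothetical, and the summary does not state how Panther actually fuses the two feature sets.

```python
def combine_features(embedding, engineered):
    """Concatenate a learned embedding with engineered feature values.

    Both arguments are plain sequences of numbers; the result is one flat
    list of floats that a downstream classifier could consume.
    """
    return [float(x) for x in embedding] + [float(x) for x in engineered]


# Hypothetical inputs: a 3-dimensional embedding of a patch and three
# hand-crafted features (for example, counts of changed statements).
patch_embedding = [0.12, -0.40, 0.33]
engineered_features = [2, 1, 0]

combined = combine_features(patch_embedding, engineered_features)
print(len(combined))
```

In practice the engineered values would usually be scaled before concatenation so that large counts do not dominate the embedding dimensions; that step is omitted here for brevity.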
Abstract
A large body of the literature on automated program repair develops approaches in which patches are automatically generated and then validated against an oracle (e.g., a test suite). Because such an oracle can be imperfect, generated patches may be incorrect even though the oracle validates them. Our empirical work investigates different representation learning approaches for code changes to derive embeddings that are amenable to similarity computations for identifying patch correctness, and assesses the possibility of accurately classifying correct patches by combining learned embeddings with engineered features. Experimental results demonstrate the potential of learned embeddings to empower Leopard (a patch correctness prediction framework implemented in this work) with learning algorithms in reasoning about patch correctness: a machine learning predictor with BERT…
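The similarity computation mentioned in the abstract can be illustrated with cosine similarity between two embedding vectors, for instance one for the buggy code and one for the patched code. This is a minimal sketch under assumptions: the abstract does not name the similarity measure here, and the vectors below are hypothetical placeholders, not outputs of any real embedding model.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine similarity of two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Similarity is undefined for a zero vector; 0.0 is a convention
        # chosen for this sketch.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical embeddings of a buggy snippet and its patched version.
buggy_embedding = [0.20, 0.70, 0.10]
patched_embedding = [0.25, 0.65, 0.05]

similarity = cosine_similarity(buggy_embedding, patched_embedding)
print(round(similarity, 3))
```

A score near 1 means the patched code stays close to the buggy code in embedding space; any threshold for separating correct from incorrect patches would have to be chosen empirically.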
Taxonomy
Topics: Software Engineering Research · Software Testing and Debugging Techniques · Software Reliability and Analysis Research
