TL;DR
This paper investigates the use of source code embeddings, specifically Doc2Vec, to rank plausible patches in JavaScript APR, revealing limitations in semantic understanding but also potential for aiding patch analysis.
Contribution
It explores the effectiveness of source code embeddings for patch ranking in APR and highlights their limitations and potential in understanding code semantics.
Findings
Plain document embeddings may misclassify patches due to poor semantic capture.
Embeddings can sometimes provide useful insights into code similarity.
The study offers insights into the challenges of semantic understanding in APR.
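To make the ranking idea in the findings concrete, here is a minimal sketch of similarity-based patch ranking. It is not the paper's pipeline: a bag-of-tokens count vector stands in for Doc2Vec so the example stays dependency-free, and the names `tokenize`, `embed`, `cosine`, and `rank_patches`, along with the sample JavaScript snippets, are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(code):
    # Split source text into identifiers, numbers, and single punctuation marks.
    return re.findall(r"[A-Za-z_$][\w$]*|\d+|[^\s\w]", code)


def embed(code):
    # Assumption: token counts stand in for a learned Doc2Vec vector.
    return Counter(tokenize(code))


def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_patches(original, candidates):
    # Order candidate patches by similarity to the original (buggy) code,
    # most similar first.
    base = embed(original)
    scored = [(cosine(base, embed(c)), c) for c in candidates]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical example: a buggy `max` and two candidate patches.
original = "function max(a, b) { return a < b ? a : b; }"
candidates = [
    "function max(a, b) { return a > b ? a : b; }",
    "function max(a, b) { return b; }",
]
for score, patch in rank_patches(original, candidates):
    print(f"{score:.3f}  {patch}")
```

A ranking of this kind only reflects surface similarity: a candidate that changes a single operator scores close to the original whether or not the change is semantically correct, which is the sort of misclassification the findings describe.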
Abstract
Despite the immense popularity of the Automated Program Repair (APR) field, the question of patch validation remains open. Most present-day approaches follow the so-called Generate-and-Validate approach, in which a candidate solution is first generated and then validated against an oracle. The latter, however, might not give a reliable result because such oracles are imperfect; one of them is usually the test suite. Although (re-)running the test suite is the obvious choice, in real-life applications the problem of over- and underfitting often occurs, resulting in inadequate patches. Efforts to tackle this problem include patch filtering, test suite expansion, careful patch generation, and many more. Most approaches to date use post-filtering, relying either on test execution traces or on some similarity concept measured on the…
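The Generate-and-Validate loop and the overfitting problem described in the abstract can be illustrated with a small sketch. Everything here is hypothetical: the repair target (an absolute-value function), the candidate patches, and the helper names `validate` and `generate_and_validate` are assumptions made for illustration, not taken from the paper.

```python
def validate(patch, test_suite):
    # The oracle: a patch is "plausible" if it passes every test case.
    return all(patch(*args) == expected for args, expected in test_suite)


def generate_and_validate(candidates, test_suite):
    # Keep the names of all candidates that the oracle accepts.
    return [name for name, patch in candidates if validate(patch, test_suite)]


# Hypothetical repair target: an absolute-value function.
candidates = [
    ("identity", lambda x: x),                # overfits a weak test suite
    ("abs", lambda x: x if x >= 0 else -x),   # the intended fix
    ("negate", lambda x: -x),
]

# A weak suite with no negative input cannot tell "identity" from "abs".
weak_suite = [((3,), 3), ((0,), 0)]
# Adding one negative input exposes the overfitting patch.
strong_suite = weak_suite + [((-2,), 2)]

print(generate_and_validate(candidates, weak_suite))
print(generate_and_validate(candidates, strong_suite))
```

With the weak suite, more than one candidate survives validation, so some additional signal, such as the similarity measures the abstract mentions, is needed to rank or filter the plausible patches.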
Taxonomy
Methods: Repair
