Fix Bugs with Transformer through a Neural-Symbolic Edit Grammar
Yaojie Hu, Xingjian Shi, Qiang Zhou, Lee Pike

TL;DR
NSEdit introduces a Transformer-based neural-symbolic approach for automatic code bug fixing by predicting edit sequences within a formal grammar, achieving state-of-the-art accuracy on code repair benchmarks.
Contribution
The paper presents NSEdit, a novel neural-symbolic Transformer model that predicts code edits using a formal grammar, improving bug repair accuracy over existing methods.
Findings
Achieved 24.04% accuracy on the Tufano small dataset of CodeXGLUE.
Demonstrated robustness across different code packages and bug types.
Validated the effectiveness of each component through detailed analysis.
Abstract
We introduce NSEdit (neural-symbolic edit), a novel Transformer-based code repair method. Given only the source code that contains bugs, NSEdit predicts an editing sequence that can fix the bugs. The edit grammar is formulated as a regular language, and the Transformer uses it as a neural-symbolic scripting interface to generate editing programs. We modify the Transformer and add a pointer network to select the edit locations. An ensemble of rerankers is trained to re-rank the editing sequences generated by beam search. We fine-tune the rerankers on the validation set to reduce over-fitting. NSEdit is evaluated on various code repair datasets and achieves a new state-of-the-art accuracy (24.04%) on the Tufano small dataset of the CodeXGLUE benchmark. NSEdit performs robustly when programs vary from package to package and when buggy programs are concrete. We conduct detailed…
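To make the idea of an edit program in a regular language more concrete, here is a minimal sketch. It is not the paper's actual grammar or implementation: the operation names (`DEL`, `INS`, `REP`), the `@k` location syntax, and the helpers `is_valid` and `apply_edits` are all hypothetical and chosen only for illustration. The sketch assumes that locations index tokens of the original buggy program, that edits do not overlap, and that code tokens never collide with the operation keywords.

```python
import re
from typing import List, Tuple

# Hypothetical edit vocabulary (an assumption, not the paper's grammar):
#   DEL @s @e        delete tokens[s:e]
#   INS @p t1 t2 ... insert tokens before position p
#   REP @s @e t1 ... replace tokens[s:e] with the given tokens
OPS = {"DEL": "D", "INS": "I", "REP": "R"}

# Each symbol is mapped to a one-letter class (D, I, R, L=location, T=token).
# A well-formed program is then a string in this regular language.
PROGRAM_SHAPE = re.compile(r"(?:DLL|ILT+|RLLT+)*")


def symbol_class(sym: str) -> str:
    """Classify a symbol as an operation, a location, or a code token."""
    if sym in OPS:
        return OPS[sym]
    if sym.startswith("@") and sym[1:].isdigit():
        return "L"
    return "T"


def is_valid(program: List[str]) -> bool:
    """Check that the flat edit program matches the regular edit grammar."""
    shape = "".join(symbol_class(s) for s in program)
    return PROGRAM_SHAPE.fullmatch(shape) is not None


def parse(program: List[str]) -> List[Tuple[str, List[int], List[str]]]:
    """Group a valid flat program into (op, locations, new_tokens) triples."""
    edits: List[Tuple[str, List[int], List[str]]] = []
    for sym in program:
        cls = symbol_class(sym)
        if cls in "DIR":
            edits.append((sym, [], []))
        elif cls == "L":
            edits[-1][1].append(int(sym[1:]))
        else:
            edits[-1][2].append(sym)
    return edits


def apply_edits(tokens: List[str], program: List[str]) -> List[str]:
    """Apply an edit program to a token list and return the edited copy."""
    if not is_valid(program):
        raise ValueError("edit program does not match the edit grammar")
    out = list(tokens)
    # Apply right to left so that locations of earlier edits stay valid.
    for op, locs, new in sorted(parse(program), key=lambda e: e[1][0], reverse=True):
        if op == "DEL":
            start, end = locs
            del out[start:end]
        elif op == "INS":
            (pos,) = locs
            out[pos:pos] = new
        else:  # REP
            start, end = locs
            out[start:end] = new
    return out


buggy = ["if", "(", "x", "=", "0", ")", "return", ";"]
fix = ["REP", "@3", "@4", "=="]  # replace the assignment with a comparison
fixed = apply_edits(buggy, fix)
```

In this toy setting the sequence model would only have to emit the short `fix` program rather than regenerate the whole function, and the grammar check rejects malformed programs before they are applied.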
Taxonomy
Topics: Software Engineering Research · Advanced Malware Detection Techniques · Software Testing and Debugging Techniques
Methods: Attention Is All You Need · Repair · Linear Layer · Sigmoid Activation · Adam · Absolute Position Encodings · Tanh Activation · Long Short-Term Memory · Byte Pair Encoding · Position-Wise Feed-Forward Layer
