Learning to Extend Program Graphs to Work-in-Progress Code
Xuechen Li, Chris J. Maddison, Daniel Tarlow

TL;DR
This paper introduces a method for extending program graphs to incomplete code by learning to predict edge relations between tokens, improving code completion and the localization and repair of variable misuse on work-in-progress code.
Contribution
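It extends program graphs to work-in-progress code by learning to predict edge relations between tokens, training on well-formed code and then transferring to incomplete code, so that relation-aware models can be used where traditional program analyses are undefined.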
Findings
Relation-aware models outperform the baseline on code completion.
Edge prediction improves variable misuse localization.
Fine-tuning edges enhances repair accuracy.
Abstract
Source code spends most of its time in a broken or incomplete state during software development. This presents a challenge to machine learning for code, since high-performing models typically rely on graph-structured representations of programs derived from traditional program analyses. Such analyses may be undefined for broken or incomplete code. We extend the notion of program graphs to work-in-progress code by learning to predict edge relations between tokens, training on well-formed code before transferring to work-in-progress code. We consider the tasks of code completion and localizing and repairing variable misuse in a work-in-progress scenario. We demonstrate that training relation-aware models with fine-tuned edges consistently leads to improved performance on both tasks.
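The motivating problem in the abstract can be illustrated with a minimal sketch. It is not the paper's method: it uses Python's standard `ast` module to derive one simple edge type from well-formed code and shows that the same analysis is undefined once the code is incomplete. The helper `same_name_edges` and its "previous occurrence of the same name" edge type are hypothetical choices made for illustration.

```python
import ast


def same_name_edges(source):
    """Link each variable occurrence to the previous occurrence of the same name.

    Hypothetical edge type for illustration only. Positions are
    (line, column) pairs. Raises SyntaxError if the source does not parse.
    """
    tree = ast.parse(source)  # fails on broken or incomplete code
    names = sorted(
        (node for node in ast.walk(tree) if isinstance(node, ast.Name)),
        key=lambda node: (node.lineno, node.col_offset),
    )
    last_seen = {}
    edges = []
    for node in names:
        position = (node.lineno, node.col_offset)
        if node.id in last_seen:
            edges.append((last_seen[node.id], position))
        last_seen[node.id] = position
    return edges


complete = "x = 1\ny = x + 2\nprint(y)\n"
incomplete = "x = 1\ny = (x + \n"  # the user is still typing this line

# Well-formed code: the analysis yields edges between token positions.
edges = same_name_edges(complete)

# Work-in-progress code: the parser rejects it, so no edges are available.
try:
    same_name_edges(incomplete)
    analysis_defined = True
except SyntaxError:
    analysis_defined = False
```

On the complete snippet the analysis links the two occurrences of `x` and the two occurrences of `y`; on the incomplete snippet it produces nothing at all. This is the gap that a learned edge predictor, trained on edges from well-formed code, is meant to fill.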
Taxonomy
Topics: Software Engineering Research · Software Testing and Debugging Techniques · Teaching and Learning Programming
