Learning to Represent Programs with Graphs
Miltiadis Allamanis, Marc Brockschmidt, Mahmoud Khademi

TL;DR
This paper introduces a graph-based approach to represent source code's syntax and semantics, enabling deep learning models to better understand and reason about program structure for tasks like variable naming and bug detection.
Contribution
It presents a method for constructing graphs from source code that capture both syntactic and semantic structure, and shows how to scale Gated Graph Neural Network training to such large program graphs.
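To make the gated propagation idea concrete, here is a minimal sketch of one message-passing step in the general Gated Graph Neural Network style: each node sums messages from its incoming edges (one weight matrix per edge type) and updates its state with a GRU-style gate. This is not the authors' implementation. The function names, the edge type names, the state size, and the random weights are all illustrative assumptions, and the code uses plain Python lists so that it has no dependencies.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rand_matrix(rng, dim):
    """Hypothetical random initialisation; a real model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(dim)] for _ in range(dim)]


def ggnn_step(states, edges, params):
    """One propagation step.

    states: {node: state vector}
    edges:  list of (src, edge_type, dst) triples
    params: per-edge-type matrices under "edge", plus GRU matrices
    """
    dim = len(next(iter(states.values())))

    # Aggregate messages: sum of (edge-type matrix @ source state) per target.
    messages = {node: [0.0] * dim for node in states}
    for src, edge_type, dst in edges:
        msg = matvec(params["edge"][edge_type], states[src])
        messages[dst] = [a + b for a, b in zip(messages[dst], msg)]

    # GRU-style update of every node state from its aggregated message.
    new_states = {}
    for node, h in states.items():
        m = messages[node]
        z = [sigmoid(a + b) for a, b in
             zip(matvec(params["Wz"], m), matvec(params["Uz"], h))]
        r = [sigmoid(a + b) for a, b in
             zip(matvec(params["Wr"], m), matvec(params["Ur"], h))]
        gated_h = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a + b) for a, b in
                zip(matvec(params["Wh"], m), matvec(params["Uh"], gated_h))]
        new_states[node] = [(1 - zi) * hi + zi * ci
                            for zi, hi, ci in zip(z, h, cand)]
    return new_states


# Toy graph with two assumed edge types and 4-dimensional node states.
rng = random.Random(0)
DIM = 4
params = {name: rand_matrix(rng, DIM)
          for name in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}
params["edge"] = {"Child": rand_matrix(rng, DIM),
                  "NextToken": rand_matrix(rng, DIM)}
states = {n: [rng.uniform(-1, 1) for _ in range(DIM)] for n in ("a", "b", "c")}
edges = [("a", "Child", "b"), ("b", "NextToken", "c"), ("a", "Child", "c")]

updated = ggnn_step(states, edges, params)
```

Running several such steps lets information travel along multi-hop paths in the graph, which is what allows distant uses of the same variable to influence one another.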
Findings
Graph representations improve variable naming accuracy.
The VarMisuse model identifies a number of bugs in mature open-source projects.
Structured modeling outperforms less structured methods.
Abstract
Learning tasks on source code (i.e., formal languages) have been considered recently, but most work has tried to transfer natural language methods and does not capitalize on the unique opportunities offered by code's known syntax. For example, long-range dependencies induced by using the same variable or function in distant locations are often not considered. We propose to use graphs to represent both the syntactic and semantic structure of code and use graph-based deep learning methods to learn to reason over program structures. In this work, we present how to construct graphs from source code and how to scale Gated Graph Neural Networks training to such large graphs. We evaluate our method on two tasks: VarNaming, in which a network attempts to predict the name of a variable given its usage, and VarMisuse, in which the network learns to reason about selecting the correct variable…
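As a rough illustration of what constructing a graph from source code can look like, the sketch below turns a small Python snippet into a labelled graph. It uses Python's standard `ast` module as a stand-in parser, which is an assumption made for convenience and not the paper's tooling. It adds three kinds of edges: `Child` edges for syntax tree structure, `LastLexicalUse` edges linking each occurrence of a variable to its previous occurrence, and `ComputedFrom` edges linking an assigned variable to the variables on the right-hand side. The edge set is a simplified subset chosen for illustration, and `build_program_graph` is a hypothetical helper name.

```python
import ast


def build_program_graph(source):
    """Return (labels, edges); edges are (src, edge_type, dst) index triples."""
    tree = ast.parse(source)
    labels, edges, index = [], [], {}

    def visit(node):
        idx = len(labels)
        index[node] = idx
        if isinstance(node, ast.Name):
            labels.append(f"Name:{node.id}")
        else:
            labels.append(type(node).__name__)
        for child in ast.iter_child_nodes(node):
            # Skip Load/Store context markers; they carry no program structure.
            if isinstance(child, ast.expr_context):
                continue
            edges.append((idx, "Child", visit(child)))
        return idx

    visit(tree)

    # Chain occurrences of the same variable in source order.
    names = sorted(
        (n for n in ast.walk(tree) if isinstance(n, ast.Name)),
        key=lambda n: (n.lineno, n.col_offset),
    )
    last_seen = {}
    for name in names:
        if name.id in last_seen:
            edges.append((index[name], "LastLexicalUse", index[last_seen[name.id]]))
        last_seen[name.id] = name

    # For `v = expr`, connect v to every variable appearing in expr.
    for stmt in ast.walk(tree):
        if isinstance(stmt, ast.Assign):
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    for used in ast.walk(stmt.value):
                        if isinstance(used, ast.Name):
                            edges.append((index[target], "ComputedFrom", index[used]))
    return labels, edges


SAMPLE = "x = 1\ny = x + 2\nx = y * x\n"
labels, edges = build_program_graph(SAMPLE)
child_edges = [e for e in edges if e[1] == "Child"]
lexical_edges = [e for e in edges if e[1] == "LastLexicalUse"]
computed_edges = [e for e in edges if e[1] == "ComputedFrom"]
```

The `LastLexicalUse` edges are what give a graph model a direct path between distant uses of the same variable, the kind of long-range dependency that a token-sequence model has to recover on its own.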
Taxonomy
Topics: Software Engineering Research · Software System Performance and Reliability · Topic Modeling
Methods: Gated Graph Sequence Neural Networks
