TL;DR
This paper evaluates how well neural program models generalize to semantically equivalent programs with different syntax, revealing limitations and potential improvements for source code analysis tasks.
Contribution
It introduces a method to assess neural program models' generalizability using semantic-preserving transformations and compares multiple models across datasets.
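To make "semantic-preserving transformation" concrete, here is a minimal sketch of one such transformation, variable renaming, applied to a toy function. It is an illustration only, not the paper's tooling: it uses Python's standard `ast` module (Python 3.9+ for `ast.unparse`), and the names `RenameVariable`, `rename_variable`, `run`, and the `total` example are all hypothetical.

```python
import ast


class RenameVariable(ast.NodeTransformer):
    """Rename one identifier wherever it appears as a name or an argument."""

    def __init__(self, old, new):
        self.old, self.new = old, new

    def visit_Name(self, node):
        # Covers reads and writes of the variable (Load and Store contexts).
        if node.id == self.old:
            node.id = self.new
        return node

    def visit_arg(self, node):
        # Covers function parameters.
        if node.arg == self.old:
            node.arg = self.new
        return node


def rename_variable(source, old, new):
    """Return source code with `old` renamed to `new` (assumes `new` is unused)."""
    tree = RenameVariable(old, new).visit(ast.parse(source))
    return ast.unparse(tree)


def run(source, func_name, *args):
    """Execute the source and call one of its functions (toy equivalence check)."""
    namespace = {}
    exec(source, namespace)
    return namespace[func_name](*args)


ORIGINAL = '''
def total(values):
    acc = 0
    for v in values:
        acc += v
    return acc
'''

# Same behavior, different lexical appearance.
TRANSFORMED = rename_variable(ORIGINAL, "acc", "var0")

if __name__ == "__main__":
    print(TRANSFORMED)
    print(run(ORIGINAL, "total", [1, 2, 3]) == run(TRANSFORMED, "total", [1, 2, 3]))
```

The two versions compute the same result on every input, so a generalizable model would be expected to give the same prediction (for example, the same method name) for both.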
Findings
Neural models often fail to generalize after semantic-preserving changes.
Models based on data and control dependencies perform better than syntax-only models.
Larger and more diverse training datasets improve model generalizability.
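One simple way to quantify the first finding is the fraction of programs whose prediction changes after a semantic-preserving transformation. The sketch below is an assumed formulation for illustration, not necessarily the metric used in the paper; `prediction_change_rate` and the stand-in `toy_model` are hypothetical, and a real evaluation would call a trained neural program model instead.

```python
def prediction_change_rate(model, pairs):
    """Fraction of (original, transformed) pairs on which the model's output differs.

    `model` is any callable mapping source text to a prediction.
    Returns 0.0 for an empty input.
    """
    pairs = list(pairs)
    if not pairs:
        return 0.0
    changed = sum(1 for original, transformed in pairs
                  if model(original) != model(transformed))
    return changed / len(pairs)


def toy_model(source):
    """Stand-in 'model' that keys on a variable name, i.e. on surface syntax only."""
    return "sum" if "acc" in source else "unknown"


PAIRS = [
    # Variable renamed: the toy model's prediction flips.
    ("acc = 0\nfor v in xs:\n    acc += v", "var0 = 0\nfor v in xs:\n    var0 += v"),
    # Loop variable renamed: the toy model's prediction is unchanged.
    ("acc = 0\nfor v in xs:\n    acc += v", "acc = 0\nfor w in xs:\n    acc += w"),
]

if __name__ == "__main__":
    print(prediction_change_rate(toy_model, PAIRS))
```

A rate of 0 would mean the model is fully robust to the transformation under this measure; higher values indicate that it relies on surface features the transformation alters.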
Abstract
With the prevalence of publicly available source code repositories to train deep neural network models, neural program models can do well in source code analysis tasks such as predicting method names in given programs that cannot be easily done by traditional program analysis techniques. Although such neural program models have been tested on various existing datasets, the extent to which they generalize to unforeseen source code is largely unknown. Since it is very challenging to test neural program models on all unforeseen programs, in this paper, we propose to evaluate the generalizability of neural program models with respect to semantic-preserving transformations: a generalizable neural program model should perform equally well on programs that are of the same semantics but of different lexical appearances and syntactical structures. We compare the results of various neural program…
Taxonomy
Methods: Gated Graph Sequence Neural Networks
