Transformer-based Program Synthesis for Low-Data Environments
Jack Roper

TL;DR
This paper proposes a transformer-based approach to program synthesis in low-data environments, using attributed grammars to generate training data and program attributes (such as types) to improve the accuracy of generated code.
Contribution
The paper combines attributed context-free grammars with program attribute analysis to improve transformer-based program synthesis, particularly in low-data scenarios.
Findings
Attributed grammars enable efficient dataset generation.
Using program attributes improves synthesis quality in low-data settings.
The approach reduces errors in generated programs.
Abstract
Recent large pre-trained transformer models (GPT-2/3, T5) have found use in program synthesis, generating programs that satisfy a set of input/output examples. However, these models perform poorly on long-horizon and low-data tasks, and often do not seem to understand the semantics of the languages they generate. We investigate an approach that tackles both issues: we use attributed context-free grammars of programming languages to generate programs, then analyze the generated programs so that they can be annotated with compile-time and runtime attributes, such as types, allowing information about the program to be remembered during long-horizon generation. We first find that synthesized datasets can be made efficiently and can provide transformer models with enough data to perform well on some synthesis tasks. We also find that giving models access to…
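The idea of generating programs from an attributed grammar and annotating them with types can be illustrated with a minimal sketch. This is not the paper's grammar or implementation: the toy expression language, the `PRODUCTIONS` and `TERMINALS` tables, and the `generate` and `make_dataset` helpers are all hypothetical names chosen for illustration. Each production carries a result type as its attribute, so every generated program is well typed by construction and can be emitted alongside a type-annotated copy.

```python
import random

# Hypothetical toy grammar (an assumption, not taken from the paper).
# Each result type maps to productions of the form (template, child types);
# the result type plays the role of the attribute attached to each node.
PRODUCTIONS = {
    "int": [
        ("({0} + {1})", ("int", "int")),
        ("({0} * {1})", ("int", "int")),
        ("(if {0} then {1} else {2})", ("bool", "int", "int")),
    ],
    "bool": [
        ("({0} < {1})", ("int", "int")),
        ("(not {0})", ("bool",)),
    ],
}
TERMINALS = {"int": ["0", "1", "2", "x"], "bool": ["true", "false"]}


def generate(type_, depth, rng):
    """Return (source, annotated) for a random expression of type `type_`."""
    # Stop at a terminal when the depth budget runs out, or at random.
    if depth == 0 or rng.random() < 0.3:
        token = rng.choice(TERMINALS[type_])
        return token, f"{token}:{type_}"
    template, child_types = rng.choice(PRODUCTIONS[type_])
    children = [generate(t, depth - 1, rng) for t in child_types]
    source = template.format(*(c[0] for c in children))
    # The annotated form tags every subexpression with its type attribute.
    annotated = template.format(*(c[1] for c in children)) + f":{type_}"
    return source, annotated


def make_dataset(n, seed=0, depth=3):
    """Generate n (source, annotated) pairs of int-typed programs."""
    rng = random.Random(seed)
    return [generate("int", depth, rng) for _ in range(n)]


if __name__ == "__main__":
    for source, annotated in make_dataset(3):
        print(source)
        print(annotated)
```

Because the grammar only expands a nonterminal into children of the required types, no separate type checker is needed to filter the samples in this sketch; a realistic language would need richer attributes (scopes, runtime values) than the single type tag shown here.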
Taxonomy
Topics: Parallel Computing and Optimization Techniques · Software Engineering Research · Software Testing and Debugging Techniques
