GAP-Gen: Guided Automatic Python Code Generation
Junchen Zhao, Yurun Song, Junlin Wang, Ian G. Harris

TL;DR
GAP-Gen is a novel method for automatic Python code generation that uses syntactic and semantic constraints to guide fine-tuning of transformer models, resulting in improved performance over previous approaches.
Contribution
The paper introduces Syntax-Flow and Variable-Flow constraints and focuses on fine-tuning transformer models, reducing computational costs while enhancing code generation accuracy.
Findings
GAP-Gen outperforms previous methods on Python code generation benchmarks.
The use of syntactic and semantic constraints improves code quality.
Fine-tuning rather than pretraining maintains high performance with lower computational resources.
Abstract
Automatic code generation from natural language descriptions can be highly beneficial during the process of software development. In this work, we propose GAP-Gen, a Guided Automatic Python Code Generation method based on Python syntactic constraints and semantic constraints. We first introduce Python syntactic constraints in the form of Syntax-Flow, a simplified version of the Abstract Syntax Tree (AST) that reduces the size and complexity of the AST while maintaining the crucial syntactic information of Python code. In addition to Syntax-Flow, we introduce Variable-Flow, which abstracts variable and function names consistently throughout the code. Rather than pretraining, we focus on modifying the fine-tuning process, which reduces computational requirements while retaining high generation performance on the automatic Python code generation task. GAP-Gen fine-tunes the…
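To make the two constraints in the abstract more concrete, here is a minimal sketch built on Python's standard `ast` module. It is not the authors' implementation. The placeholder scheme (`func_0`, `var_0`, ...), the choice to keep only statement-level node types, and the helper names `abstract_names` and `syntax_flow` are all assumptions made for illustration. The paper's actual Syntax-Flow and Variable-Flow formats may differ.

```python
import ast
import builtins


class NameAbstractor(ast.NodeTransformer):
    """Hypothetical helper: map user-defined names to consistent placeholders."""

    def __init__(self):
        self.mapping = {}  # original name -> placeholder

    def placeholder(self, name, prefix):
        # Reuse the existing placeholder so that every occurrence of a name
        # is abstracted the same way throughout the code.
        if name not in self.mapping:
            count = sum(1 for v in self.mapping.values() if v.startswith(prefix))
            self.mapping[name] = f"{prefix}_{count}"
        return self.mapping[name]

    def visit_FunctionDef(self, node):
        node.name = self.placeholder(node.name, "func")
        self.generic_visit(node)
        return node

    def visit_arg(self, node):
        node.arg = self.placeholder(node.arg, "var")
        return node

    def visit_Name(self, node):
        # Leave builtins such as print or len untouched (an assumption).
        if hasattr(builtins, node.id):
            return node
        node.id = self.placeholder(node.id, "var")
        return node


def abstract_names(source):
    """Return (abstracted source, name mapping). Needs Python 3.9+ for ast.unparse."""
    tree = ast.parse(source)
    abstractor = NameAbstractor()
    # First pass: register function names so that calls to them are
    # labelled func_N rather than var_N.
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef):
            abstractor.placeholder(node.name, "func")
    tree = abstractor.visit(tree)
    return ast.unparse(tree), abstractor.mapping


def syntax_flow(source):
    """Reduce the AST to its statement node types, in ast.walk (breadth-first) order."""
    tree = ast.parse(source)
    return [type(n).__name__ for n in ast.walk(tree) if isinstance(n, ast.stmt)]


if __name__ == "__main__":
    example = "def add(a, b):\n    total = a + b\n    return total\n"
    print(abstract_names(example)[0])
    print(syntax_flow(example))  # ['FunctionDef', 'Assign', 'Return']
```

In this sketch, attribute names, keyword arguments, and imported names are left unchanged. A fuller treatment would need to decide how to handle each of them.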
Taxonomy
Topics: Natural Language Processing Techniques · Software Engineering Research · Topic Modeling
Methods: Attention Is All You Need · Linear Layer · Byte Pair Encoding · Inverse Square Root Schedule · Residual Connection · Dropout · SentencePiece · Layer Normalization · Dense Connections
