Python Code Generation by Asking Clarification Questions
Haau-Sing Li, Mohsen Mesgar, André F. T. Martins, Iryna Gurevych

TL;DR
This paper proposes a setup for code generation from natural language in which the model asks clarification questions about under-specified descriptions. It shows that the resulting clarifications lead to more precise generated code and introduces a new dataset, CodeClarQA, for this task.
Contribution
It introduces a novel setup for code generation that involves asking clarification questions, along with a new dataset, CodeClarQA, to facilitate research in this area.
Findings
Clarification questions improve code generation accuracy
Pretrained models perform better with clarification-based interactions
The new dataset enables research on when and what questions to ask
Abstract
Code generation from text requires understanding the user's intent from a natural language description and generating an executable code snippet that satisfies this intent. While recent pretrained language models demonstrate remarkable performance for this task, these models fail when the given natural language description is under-specified. In this work, we introduce a novel and more realistic setup for this task. We hypothesize that the under-specification of a natural language description can be resolved by asking clarification questions. Therefore, we collect and introduce a new dataset named CodeClarQA containing pairs of natural language descriptions and code with created synthetic clarification questions and answers. The empirical results of our evaluation of pretrained language model performance on code generation show that clarifications result in more precisely generated…
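As a minimal sketch of the setup described in the abstract, the snippet below pairs a natural language description with clarification questions and answers, then joins them into one text input for a code generation model. The record layout, the names `ClarifiedExample` and `build_model_input`, the `Description:`/`Q:`/`A:` input format, and the sample content are all illustrative assumptions; they are not the actual CodeClarQA schema or the authors' pipeline.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ClarifiedExample:
    """Hypothetical record layout; the real CodeClarQA schema may differ."""
    description: str                                  # possibly under-specified request
    clarifications: List[Tuple[str, str]] = field(default_factory=list)  # (question, answer)
    code: str = ""                                    # target code snippet


def build_model_input(example: ClarifiedExample) -> str:
    """Join the description and its clarification Q/A pairs, one item per line.

    The line format is an assumption made for illustration only.
    """
    parts = [f"Description: {example.description}"]
    for question, answer in example.clarifications:
        parts.append(f"Q: {question}")
        parts.append(f"A: {answer}")
    return "\n".join(parts)


# Made-up example: the description does not say which classifier to use,
# and the clarification answer resolves that gap.
example = ClarifiedExample(
    description="Train a classifier on the data.",
    clarifications=[("Which classifier do you want to use?", "Logistic regression")],
    code="model = LogisticRegression().fit(X, y)",
)
prompt = build_model_input(example)
print(prompt)
```

With no clarifications, `build_model_input` returns only the description line, which corresponds to the under-specified baseline input.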
Taxonomy
Topics: Natural Language Processing Techniques · Software Engineering Research · Topic Modeling
