Code Generation for Unknown Libraries via Reading API Documentations
Koki Washio, Yusuke Miyao

TL;DR
This paper proposes a framework for open-domain code generation that leverages API documentation to generate code involving unknown libraries without additional training, demonstrating improved performance over baselines.
Contribution
It introduces a model that extracts relevant API signatures from documentation to generate code for unknown libraries, addressing a key challenge in open-domain code synthesis.
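The extract-then-copy idea can be illustrated with a toy sketch. This is not the paper's model: it swaps in plain Jaccard token overlap for the learned extractor, and the function names (`tokenize`, `rank_signatures`, `copyable_primitives`) and the small documentation list are hypothetical, chosen only for illustration.

```python
import re

def tokenize(text):
    """Lowercase the text and split it on anything that is not a letter or digit."""
    return set(t for t in re.split(r"[^a-z0-9]+", text.lower()) if t)

def rank_signatures(intent, docs, top_k=2):
    """Rank (signature, description) pairs by Jaccard overlap with the intent.

    Assumption: lexical overlap stands in for the learned relevance model.
    """
    intent_tokens = tokenize(intent)
    scored = []
    for signature, description in docs:
        doc_tokens = tokenize(signature + " " + description)
        union = intent_tokens | doc_tokens
        score = len(intent_tokens & doc_tokens) / len(union) if union else 0.0
        scored.append((score, signature))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [signature for _, signature in scored[:top_k]]

def copyable_primitives(signature):
    """Split a signature into identifier tokens that a decoder could copy."""
    return re.findall(r"[A-Za-z_][A-Za-z0-9_]*", signature)

# Hypothetical documentation entries: (signature, one-line description).
docs = [
    ("os.path.join(path, *paths)", "Join one or more path segments"),
    ("json.dumps(obj)", "Serialize obj to a JSON formatted str"),
    ("re.sub(pattern, repl, string)", "Replace occurrences of pattern in string"),
]

best = rank_signatures("serialize a dict to a json string", docs, top_k=1)
primitives = copyable_primitives(best[0])
```

In this sketch the retrieved signature supplies the names (`json`, `dumps`, `obj`) that a generator would otherwise have to have seen during training, which is the gap the framework is meant to close.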
Findings
The model outperforms baseline encoder-decoder models on the resplit datasets.
It can generate code primitives for unknown libraries when the extracted signatures are noise-free.
Baseline models struggle with primitives from unseen libraries.
Abstract
Open-domain code generation is a challenging problem because the set of functions and classes that we use are frequently changed and extended in programming communities. We consider the challenge of code generation for unknown libraries without additional training. In this paper, we explore a framework of code generation that can refer to relevant API documentations like human programmers to handle unknown libraries. As a first step of this direction, we implement a model that can extract relevant code signatures from API documentations based on a natural language intent and copy primitives from the extracted signatures. Moreover, to evaluate code generation for unknown libraries and our framework, we extend an existing dataset of open-domain code generation and resplit it so that the evaluation data consist of only examples using the libraries that do not appear in the training data.…
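The resplitting described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the `libraries` field, the function name `resplit_by_library`, and the rule that an example touching any held-out library goes to the test split are all hypothetical choices.

```python
def resplit_by_library(examples, held_out_libraries):
    """Send every example that uses a held-out library to the test split.

    Assumption: each example lists the libraries its code uses, and a single
    held-out library is enough to move the example out of training.
    """
    held_out = set(held_out_libraries)
    train, test = [], []
    for example in examples:
        if held_out & set(example["libraries"]):
            test.append(example)
        else:
            train.append(example)
    return train, test

# Hypothetical toy examples; real data would pair each intent with a code snippet.
examples = [
    {"intent": "join two paths", "libraries": ["os"]},
    {"intent": "parse a json string", "libraries": ["json"]},
    {"intent": "read a csv file into a table", "libraries": ["pandas"]},
    {"intent": "save a table as json", "libraries": ["pandas", "json"]},
]

train, test = resplit_by_library(examples, held_out_libraries=["pandas"])
train_libraries = {lib for ex in train for lib in ex["libraries"]}
```

With this rule the held-out library never appears in training, so every test example requires at least one library the model has not seen.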
Taxonomy
Topics: Software Engineering Research · Advanced Malware Detection Techniques · Software Reliability and Analysis Research
