Combining Contexts from Multiple Sources for Documentation-Specific Code Example Generation

Junaed Younus Khan; Gias Uddin

arXiv:2303.14542 · cs.SE · March 28, 2023 · 1 citation

Open Access

TL;DR

This paper explores automatic generation of code examples for documentation using Codex, a GPT-3-based model. It reports promising rates of passability (examples that execute without error) and relevance to the documentation, and shows that including error logs increases the share of examples that execute successfully.

Contribution

The paper introduces an approach for generating documentation-specific code examples with Codex and evaluates the impact of error logs on code passability.

Findings

01. 72.5% of code examples executed without error
02. 82.5% of code examples were relevant to the documentation
03. Including error logs improved passability to 87.5%
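To make the passability metric and the error-log finding concrete, here is a minimal sketch in Python. It is not the authors' implementation: the helper names (`run_snippet`, `passability`, `repair_with_error_log`) and the `regenerate` callback, which stands in for a second call to a code model, are hypothetical. The sketch runs each generated example in a fresh interpreter, treats a zero exit code as a pass, and hands the captured error log back to the generator for failing examples.

```python
import subprocess
import sys
from typing import Callable, List, Optional


def run_snippet(code: str, timeout: float = 30.0) -> Optional[str]:
    """Run `code` in a fresh interpreter.

    Returns None if it exits cleanly, otherwise the captured error log.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return "TimeoutExpired: snippet did not finish in time"
    return None if proc.returncode == 0 else proc.stderr


def passability(snippets: List[str]) -> float:
    """Fraction of snippets that execute without error."""
    passed = sum(run_snippet(s) is None for s in snippets)
    return passed / len(snippets)


def repair_with_error_log(
    code: str, regenerate: Callable[[str, str], str]
) -> str:
    """Return `code` unchanged if it runs; otherwise ask `regenerate`
    (a hypothetical stand-in for the model) for a new version, passing
    the failing code together with its error log."""
    error_log = run_snippet(code)
    if error_log is None:
        return code
    return regenerate(code, error_log)
```

With a list of generated examples, `passability(examples)` gives the baseline rate, and `passability([repair_with_error_log(e, regenerate) for e in examples])` gives the rate after one round of error-log feedback. A single retry is an assumption of this sketch; the summary above does not state how many rounds were used.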

Abstract

Code examples are a crucial part of good documentation. They help developers understand the documentation easily and use the corresponding code unit (e.g., a method) properly. However, much official documentation still lacks (good) code examples, which several studies have identified as a common documentation issue. Hence, in this paper, we consider automatic code example generation for documentation, a direction less explored by existing research. We employ Codex, a GPT-3-based model pre-trained on both natural and programming languages, to generate code examples from source code and documentation given as input. Our preliminary investigation of 40 scikit-learn methods reveals that this approach can generate good code examples: 72.5% of the code examples executed without error (passability) and 82.5% properly dealt with the target method and documentation…
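The abstract describes giving the model both the source code and the documentation of a target method as input. One way this combination could look is sketched below in Python. The function name `build_prompt`, the section headers, and the closing instruction are all assumptions made for illustration; the paper's actual prompt template is not shown here.

```python
def build_prompt(source_code: str, documentation: str) -> str:
    """Combine two context sources into one prompt string.

    The layout (comment-style headers, blank-line separators) is an
    illustrative assumption, not the template used in the paper.
    """
    sections = [
        "# Source code of the target method:\n" + source_code.strip(),
        "# Documentation of the target method:\n" + documentation.strip(),
        "# Write a short, runnable code example that uses the method above:",
    ]
    return "\n\n".join(sections)


if __name__ == "__main__":
    # Toy inputs; a real run would use the method's actual source and docs.
    prompt = build_prompt(
        source_code="def add(a, b):\n    return a + b",
        documentation="add(a, b): return the sum of a and b.",
    )
    print(prompt)
```

The resulting string would then be sent to a code model, and the returned example checked for passability and relevance.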

Taxonomy

Topics: Software Engineering Research · Machine Learning and Data Classification · Software System Performance and Reliability