Automatically Extracting Subroutine Summary Descriptions from Unstructured Comments

Zachary Eberhart; Alexander LeClair; Collin McMillan

arXiv:1912.10198·cs.SE·December 24, 2019

TL;DR

This paper introduces semi-automated and automated methods for extracting subroutine summaries from unstructured code comments, addressing the challenge of generating documentation for large, unannotated legacy codebases.

Contribution

It proposes novel crowdsourcing and automation techniques for extracting subroutine summaries without requiring prior annotations.

Findings

01 Validated approaches through experiments

02 Provided cost estimates for large-scale annotation

03 Demonstrated effectiveness in unstructured comments

Abstract

Summary descriptions of subroutines are short (usually one-sentence) natural language explanations of a subroutine's behavior and purpose in a program. These summaries are ubiquitous in documentation, and many tools such as JavaDocs and Doxygen generate documentation built around them. And yet, extracting summaries from unstructured source code repositories remains a difficult research problem: it is very difficult to generate clean structured documentation unless the summaries are annotated by programmers. This becomes a problem in large repositories of legacy code, since it is cost-prohibitive to retroactively annotate summaries in dozens or hundreds of old programs. Likewise, it is a problem for creators of automatic documentation generation algorithms, since these algorithms usually must learn from large annotated datasets, which do not exist for many programming languages. In…

Code & Models

Repositories

NoPro2019/NoPro_2019
tf · Official
