# A Neural Model for Generating Natural Language Summaries of Program Subroutines

**Authors:** Alexander LeClair, Siyuan Jiang, Collin McMillan

arXiv: 1902.01954 · 2019-02-07

## TL;DR

This paper introduces a neural model that generates natural language summaries of program subroutines by combining words from code with structural information from the abstract syntax tree (AST). The model can produce coherent summaries in many cases even when the code has no internal documentation.

## Contribution

Unlike previous approaches, the proposed model processes words from code and AST structure as separate inputs. This lets it learn code structure independently of the text in code, so it depends less on clear identifier names and other internal documentation.

## Key findings

- Improves over two baseline techniques from the SE literature and one from the NLP literature
- Produces coherent summaries in many cases even when zero internal documentation is provided
- Evaluated on a dataset the authors created from 2.1 million Java methods

## Abstract

Source code summarization -- creating natural language descriptions of source code behavior -- is a rapidly-growing research topic with applications to automatic documentation generation, program comprehension, and software maintenance. Traditional techniques relied on heuristics and templates built manually by human experts. Recently, data-driven approaches based on neural machine translation have largely overtaken template-based systems. But nearly all of these techniques rely almost entirely on programs having good internal documentation; without clear identifier names, the models fail to create good summaries. In this paper, we present a neural model that combines words from code with code structure from an AST. Unlike previous approaches, our model processes each data source as a separate input, which allows the model to learn code structure independent of the text in code. This process helps our approach provide coherent summaries in many cases even when zero internal documentation is provided. We evaluate our technique with a dataset we created from 2.1m Java methods. We find improvement over two baseline techniques from SE literature and one from NLP literature.
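The idea of feeding code words and code structure as two separate inputs can be illustrated with a small sketch. This is not the authors' pipeline: the paper works on Java methods, and this summary does not specify its AST representation. As a stand-in, the sketch uses Python's standard `ast` module on a Python function; the helper names `split_identifier` and `extract_inputs` are hypothetical. It shows only how one subroutine can yield a word sequence and a structure-only sequence that contains no identifier text.

```python
import ast
import re

# Toy subroutine to summarize (an assumption for illustration; the paper uses Java methods).
SOURCE = '''
def get_user_name(user_id):
    record = lookup(user_id)
    return record.name
'''


def split_identifier(name):
    """Split snake_case or camelCase identifiers into lowercase words."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", name).replace("_", " ")
    return [part.lower() for part in spaced.split()]


def extract_inputs(source):
    """Return (words, structure) for a piece of source code.

    words:     identifier words met during a pre-order walk of the AST
    structure: AST node type names in the same walk, with no identifier text
    """
    tree = ast.parse(source)
    words, structure = [], []

    def visit(node):
        structure.append(type(node).__name__)
        # Fields that carry identifier text on common node types.
        for field in ("name", "id", "arg", "attr"):
            value = getattr(node, field, None)
            if isinstance(value, str):
                words.extend(split_identifier(value))
        for child in ast.iter_child_nodes(node):
            # Skip Load/Store context markers; they add noise, not structure.
            if isinstance(child, ast.expr_context):
                continue
            visit(child)

    visit(tree)
    return words, structure


if __name__ == "__main__":
    words, structure = extract_inputs(SOURCE)
    print("word input:     ", words)
    print("structure input:", structure)
```

Because the structure sequence holds only node types, it stays the same if every identifier in the function is renamed to something meaningless. That is the property that would let a model with a separate structure input still learn from code lacking clear names.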

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.01954/full.md

## Figures

23 figures with captions in the complete paper: https://tomesphere.com/paper/1902.01954/full.md

## References

60 references — full list in the complete paper: https://tomesphere.com/paper/1902.01954/full.md

---
Source: https://tomesphere.com/paper/1902.01954