A Topic Guided Pointer-Generator Model for Generating Natural Language Code Summaries
Xin Wang, Xin Peng, Jun Sun, Yifan Zhao, Chi Chen, and Jinkai Fan

TL;DR
This paper introduces ToPNN, a neural network model that leverages broader context and topic guidance to improve automatic code summarization, significantly outperforming existing methods on a large Java dataset.
Contribution
The novel ToPNN model incorporates class-level topics and a copy mechanism to enhance code summarization beyond existing neural translation approaches.
Findings
Significant performance improvement over state-of-the-art models
Class topics and copy mechanism positively impact summarization quality
Effective at the method level for Java code
Abstract
Code summarization is the task of generating natural language descriptions of source code, which is important for program understanding and maintenance. Existing approaches treat the task as a machine translation problem (e.g., from Java to English) and apply Neural Machine Translation (NMT) models to solve it. These approaches consider only a given code unit (e.g., a method) without its broader context. This lack of context may prevent the NMT model from gathering sufficient information for code summarization. Furthermore, existing approaches use a fixed vocabulary and do not fully consider the words in the code, even though many words in the code summary may come from the code. In this work, we present a neural network model named ToPNN for code summarization, which uses the topics in a broader context (e.g., class) to guide the neural networks that combine the generation of new words and…
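The combination of generating new words and copying words from the code can be illustrated with the generic pointer-generator mixture, in which the final word distribution is a weighted sum of a vocabulary distribution and an attention-based copy distribution. The sketch below is a minimal illustration of that generic idea, not the ToPNN implementation: the function name, the parameter `p_gen`, and the toy values are all assumptions, and topic guidance is not modeled.

```python
from collections import defaultdict


def pointer_generator_distribution(p_gen, vocab_probs, attention, source_tokens):
    """Mix a vocabulary distribution with a copy distribution over source tokens.

    Hypothetical helper for illustration only (not from the paper).
    p_gen         -- probability of generating from the fixed vocabulary
    vocab_probs   -- dict mapping vocabulary words to probabilities (sums to 1)
    attention     -- attention weights over source positions (sums to 1)
    source_tokens -- tokens of the source code, aligned with `attention`
    """
    if not 0.0 <= p_gen <= 1.0:
        raise ValueError("p_gen must be in [0, 1]")
    if len(attention) != len(source_tokens):
        raise ValueError("attention and source_tokens must have the same length")

    final = defaultdict(float)
    # Generation path: scale the fixed-vocabulary distribution by p_gen.
    for word, prob in vocab_probs.items():
        final[word] += p_gen * prob
    # Copy path: add the attention mass of each source position to its token,
    # so out-of-vocabulary identifiers can still receive probability.
    for token, weight in zip(source_tokens, attention):
        final[token] += (1.0 - p_gen) * weight
    return dict(final)


# Toy values, assumed for illustration: tokens from a method named getUserName.
source_tokens = ["get", "user", "name"]
attention = [0.2, 0.5, 0.3]
vocab_probs = {"returns": 0.6, "the": 0.3, "name": 0.1}

dist = pointer_generator_distribution(0.7, vocab_probs, attention, source_tokens)
for word, prob in sorted(dist.items(), key=lambda item: -item[1]):
    print(f"{word}: {prob:.2f}")
```

In this toy setup, "user" is absent from the fixed vocabulary but still receives probability through the copy path, which is the motivation the abstract gives for not relying on a fixed vocabulary alone.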
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Software Engineering Research
