Statement-based Memory for Neural Source Code Summarization
Aakash Bansal, Siyuan Jiang, Sakib Haque, and Collin McMillan

TL;DR
This paper introduces a statement-based memory encoder for neural source code summarization that captures code flow without dynamic analysis, significantly improving summary quality over existing methods.
Contribution
The paper proposes a novel statement-based memory encoder that learns code flow during training, enhancing neural code summarization without dynamic analysis.
Findings
Significant improvement over state-of-the-art methods
Effective capture of code flow without dynamic analysis
Enhanced accuracy in code summarization
Abstract
Source code summarization is the task of writing natural language descriptions of source code behavior. Code summarization underpins software documentation for programmers. Short descriptions of code help programmers understand the program quickly without having to read the code itself. Lately, neural source code summarization has emerged as the frontier of research into automated code summarization techniques. By far the most popular targets for summarization are program subroutines. The idea, in a nutshell, is to train an encoder-decoder neural architecture using large sets of examples of subroutines extracted from code repositories. The encoder represents the code and the decoder represents the summary. However, most current approaches attempt to treat the subroutine as a single unit, for example by taking the entire subroutine as input to a Transformer or RNN-based encoder. But…
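To make the "single unit" idea in the abstract concrete, here is a minimal sketch contrasting two ways of presenting a subroutine to an encoder: as one flat token sequence, or as a list of per-statement token sequences. This is an illustration only and not the paper's preprocessing, which this page does not describe. The names `SUBROUTINE`, `tokenize`, `as_single_unit`, and `as_statements` are hypothetical, and the one-statement-per-line split is a simplifying assumption; a real pipeline would more likely use a parser.

```python
import re

# Hypothetical Java-like subroutine used only for illustration.
SUBROUTINE = """
public int sum(int[] xs) {
    int total = 0;
    for (int x : xs) {
        total += x;
    }
    return total;
}
"""

# Identifiers, integer literals, or any single punctuation character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|[^\sA-Za-z_\d]")
BRACES = {"{", "}"}


def tokenize(text):
    """Split text into identifiers, integers, and single punctuation marks."""
    return TOKEN_RE.findall(text)


def as_single_unit(code):
    """Whole subroutine as one flat token sequence (the 'single unit' view)."""
    return tokenize(code)


def as_statements(code):
    """One token list per statement.

    Assumption: each non-empty source line is one statement, and lines
    that contain only braces are dropped. This is a naive stand-in for
    real statement extraction.
    """
    statements = []
    for line in code.splitlines():
        tokens = tokenize(line)
        if any(tok not in BRACES for tok in tokens):
            statements.append(tokens)
    return statements


if __name__ == "__main__":
    flat = as_single_unit(SUBROUTINE)
    stmts = as_statements(SUBROUTINE)
    print(len(flat), "tokens in the flat view")
    print(len(stmts), "statements in the statement view")
    for tokens in stmts:
        print(tokens)
```

In the flat view, statement boundaries are visible only through punctuation tokens; in the statement view, each statement is its own sequence that an encoder could in principle process separately.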
Taxonomy
Topics: Software Engineering Research · Topic Modeling · Natural Language Processing Techniques
Methods: Attention Is All You Need · Linear Layer · Label Smoothing · Layer Normalization · Absolute Position Encodings · Multi-Head Attention · Softmax · Dense Connections · Dropout · Residual Connection
