Statement-based Memory for Neural Source Code Summarization

Aakash Bansal, Siyuan Jiang, Sakib Haque, and Collin McMillan

arXiv:2307.11709·cs.AI·July 24, 2023

TL;DR

This paper introduces a statement-based memory encoder for neural source code summarization that captures code flow without dynamic analysis, significantly improving summary quality over existing methods.

Contribution

The paper proposes a novel statement-based memory encoder that learns code flow during training, enhancing neural code summarization without dynamic analysis.

Findings

01. Significant improvement over state-of-the-art methods
02. Effective capture of code flow without dynamic analysis
03. Enhanced accuracy in code summarization

Abstract

Source code summarization is the task of writing natural language descriptions of source code behavior. Code summarization underpins software documentation for programmers. Short descriptions of code help programmers understand the program quickly without having to read the code itself. Lately, neural source code summarization has emerged as the frontier of research into automated code summarization techniques. By far the most popular targets for summarization are program subroutines. The idea, in a nutshell, is to train an encoder-decoder neural architecture using large sets of examples of subroutines extracted from code repositories. The encoder represents the code and the decoder represents the summary. However, most current approaches attempt to treat the subroutine as a single unit, for example by taking the entire subroutine as input to a Transformer or RNN-based encoder. But…
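To make the contrast in the abstract concrete, the sketch below prepares the same subroutine in two ways: as one flat token sequence (the "single unit" view) and as a list of per-statement token sequences (the kind of input a statement-level encoder could consume). This is only an illustration, not the authors' preprocessing or encoder; the function names `as_single_unit` and `as_statements` are hypothetical, and the statement split is a naive one on `;`, `{`, and `}` that a real pipeline would replace with a parser.

```python
import re
from typing import List


def as_single_unit(code: str) -> List[str]:
    """Whole-subroutine view: one flat sequence of tokens."""
    # Identifiers/keywords, integer literals, or any single non-space symbol.
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", code)


def as_statements(code: str) -> List[List[str]]:
    """Statement-level view: one token list per statement.

    Assumption: statements are delimited by ';', '{' or '}'. This breaks on
    constructs such as for-loop headers or string literals containing these
    characters; it is only meant to show the shape of the data.
    """
    chunks = re.split(r"[;{}]", code)
    return [as_single_unit(chunk) for chunk in chunks if chunk.strip()]


example = "int add(int a, int b) { int s = a + b; return s; }"
flat = as_single_unit(example)   # one sequence for the whole subroutine
stmts = as_statements(example)   # one sequence per statement
```

In the flat view the encoder sees a single sequence and must infer statement boundaries itself; in the statement-level view those boundaries are explicit, which is the kind of structure a statement-based memory could build on.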

Code & Models

Repositories

aakashba/smncode2022 · TensorFlow · Official

Taxonomy

Topics: Software Engineering Research · Topic Modeling · Natural Language Processing Techniques

Methods: Attention Is All You Need · Linear Layer · Label Smoothing · Layer Normalization · Absolute Position Encodings · Multi-Head Attention · Softmax · Dense Connections · Dropout · Residual Connection