Code Summarization with Structure-induced Transformer

Hongqiu Wu; Hai Zhao; Min Zhang

arXiv:2012.14710 · cs.CL · June 2, 2021 · 5 cites

TL;DR

This paper introduces a structure-induced Transformer model for code summarization that incorporates structural clues into self-attention, achieving state-of-the-art results on benchmark datasets.

Contribution

It proposes a novel structure-induced self-attention mechanism that encodes the structural information of source code within the Transformer architecture.
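The idea of restricting self-attention with structural clues can be illustrated with a minimal sketch. It assumes, as a simplification this page does not confirm, that the structure is supplied as a binary adjacency matrix over code tokens and that attention between unconnected tokens is masked out before the softmax. The function name `structure_masked_attention` and the toy inputs are hypothetical; the authors' implementation is in the repository listed under Code & Models and may differ.

```python
import math


def softmax(scores):
    """Numerically stable softmax; entries equal to -inf receive weight 0."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def structure_masked_attention(queries, keys, values, adjacency):
    """Scaled dot-product attention restricted by a structural mask.

    adjacency[i][j] is truthy when token i may attend to token j
    (for example, when the two tokens are linked in a syntax tree).
    This masking scheme is an assumption made for illustration.
    """
    d_k = len(keys[0])
    width = len(values[0])
    output = []
    for i, q in enumerate(queries):
        if not any(adjacency[i]):
            # Every position masked: softmax would be undefined.
            raise ValueError(f"token {i} has no position to attend to")
        scores = []
        for j, k in enumerate(keys):
            if adjacency[i][j]:
                dot = sum(a * b for a, b in zip(q, k))
                scores.append(dot / math.sqrt(d_k))
            else:
                scores.append(float("-inf"))  # drop unconnected pairs
        weights = softmax(scores)
        output.append(
            [sum(w * v[c] for w, v in zip(weights, values)) for c in range(width)]
        )
    return output


# Toy example: three tokens, where tokens 0 and 2 are not connected.
adjacency = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
]
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = structure_masked_attention(x, x, x, adjacency)
```

Keeping the diagonal of the adjacency matrix set to 1 (each token attends to itself) guarantees that no row is fully masked, which is why the sketch raises an error otherwise.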

Findings

01. Achieves new state-of-the-art results on code summarization benchmarks.

02. Incorporating structural clues improves Transformer performance.

03. Outperforms previous models that rely on structure-based traversal or GNNs.

Abstract

Code summarization (CS) is an increasingly promising area of language understanding. It aims to automatically generate sensible natural-language descriptions of source code, for the convenience of programmers during development. Programming languages are well known to be highly structured, so previous work has applied structure-based traversal (SBT) or non-sequential models such as Tree-LSTM and graph neural networks (GNNs) to learn structural program semantics. Surprisingly, however, incorporating SBT into an advanced encoder such as the Transformer, rather than an LSTM, has been shown to yield no performance gain, which leaves GNNs as the only remaining means of modeling this necessary structural clue in source code. To remove this limitation, we propose the structure-induced Transformer, which encodes sequential code inputs with multi-view structural clues in terms of…

Code & Models

Repositories

gingasan/sit3 (PyTorch, official)

Taxonomy

Topics: Software Engineering Research · Topic Modeling · Natural Language Processing Techniques

Methods: Graph Neural Network · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Tanh Activation · Sigmoid Activation · Long Short-Term Memory · Byte Pair Encoding · Multi-Head Attention · Dropout