Code Summarization with Structure-induced Transformer
Hongqiu Wu, Hai Zhao, Min Zhang

TL;DR
This paper introduces a structure-induced Transformer model for code summarization that incorporates structural clues into self-attention, achieving state-of-the-art results on benchmark datasets.
Contribution
It proposes a novel structure-induced self-attention mechanism that encodes the structural information in source code within the Transformer architecture.
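A minimal sketch of one way structural information could be injected into self-attention: masking the attention scores with a token-level adjacency matrix, so that a token attends only to structurally related tokens. This is an illustration under stated assumptions, not the paper's implementation. The function name `structure_masked_attention`, the `adjacency` argument, and the use of the raw inputs as queries, keys, and values (no learned projections, single head) are all hypothetical simplifications.

```python
import math

def softmax(row):
    """Numerically stable softmax; entries equal to -inf receive weight 0."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def structure_masked_attention(x, adjacency):
    """Single-head self-attention restricted by a structural mask.

    x:         list of n token vectors, each of dimension d
    adjacency: n x n 0/1 matrix (hypothetical); token i may attend to
               token j only if adjacency[i][j] == 1. Every row must allow
               at least one position (e.g. a self-loop), otherwise the
               softmax is undefined.
    """
    n, d = len(x), len(x[0])
    scale = math.sqrt(d)
    outputs, weights = [], []
    for i in range(n):
        # Scaled dot-product scores, with disallowed pairs set to -inf.
        scores = [
            sum(x[i][k] * x[j][k] for k in range(d)) / scale
            if adjacency[i][j] else float("-inf")
            for j in range(n)
        ]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of the (unprojected) token vectors.
        outputs.append([sum(w[j] * x[j][k] for j in range(n)) for k in range(d)])
    return outputs, weights

# Toy example: three tokens; tokens 0 and 2 are not structurally connected.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
adjacency = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 1, 1],
]
outputs, weights = structure_masked_attention(x, adjacency)
```

In this toy setup the attention weight between tokens 0 and 2 is exactly zero in both directions, while each row of `weights` still sums to one over the allowed positions.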
Findings
Achieves new state-of-the-art results on code summarization benchmarks.
Incorporating structural clues improves Transformer performance.
Outperforms previous models that rely on structure-based traversal (SBT) or GNNs.
Abstract
Code summarization (CS) is a promising area of language understanding that aims to automatically generate sensible natural-language descriptions of source code, for the convenience of programmers during development. Programming languages are well known to be highly structured, so previous work has applied structure-based traversal (SBT) or non-sequential models such as Tree-LSTM and graph neural networks (GNNs) to learn structural program semantics. Surprisingly, however, incorporating SBT into an advanced encoder such as the Transformer, instead of an LSTM, has been shown to yield no performance gain, which leaves GNNs as the only remaining means of modeling the necessary structural clues in source code. To relieve this inconvenience, we propose the structure-induced Transformer, which encodes sequential code inputs with multi-view structural clues in terms of…
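For readers unfamiliar with structure-based traversal (SBT), the sketch below shows the idea as it is usually described in earlier code-summarization work: a syntax tree is flattened into a bracketed token sequence that a sequential encoder can consume. The abstract does not define SBT, so the exact bracket format here is an assumption, and the `(label, children)` tree encoding and node labels are hypothetical.

```python
def sbt(node):
    """Linearize a tree into a bracketed token sequence.

    node: a (label, children) pair, where children is a list of nodes.
    Each subtree is wrapped as "( label ... ) label", so the original
    tree can be recovered from the flat sequence.
    """
    label, children = node
    tokens = ["(", label]
    for child in children:
        tokens.extend(sbt(child))
    tokens.extend([")", label])
    return tokens

# Hypothetical syntax tree for the statement `return a + b`.
tree = ("Return", [("BinOp", [("Name:a", []), ("Name:b", [])])])
sequence = " ".join(sbt(tree))
print(sequence)
# ( Return ( BinOp ( Name:a ) Name:a ( Name:b ) Name:b ) BinOp ) Return
```

The resulting sequence carries the tree structure only implicitly through the brackets, which is one plausible reason a purely sequential encoder may gain little from it.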
Taxonomy
Topics: Software Engineering Research · Topic Modeling · Natural Language Processing Techniques
Methods: Graph Neural Network · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Tanh Activation · Sigmoid Activation · Long Short-Term Memory · Byte Pair Encoding · Multi-Head Attention · Dropout
