GN-Transformer: Fusing Sequence and Graph Representation for Improved Code Summarization
Junyan Cheng, Iordanis Fostiropoulos, and Barry Boehm

TL;DR
GN-Transformer fuses a source code sequence with its Abstract Syntax Tree (AST) into a single graph representation, improving automatic code summarization on both automated metrics and human-perceived quality.
Contribution
The paper introduces GN-Transformer, a novel end-to-end model that fuses sequence and graph modalities for code summarization, achieving state-of-the-art results.
Findings
Achieves state-of-the-art performance on two datasets
Outperforms previous models on the BLEU, METEOR, and ROUGE-L metrics
Enhances human-perceived quality of code summaries
Abstract
As opposed to natural languages, source code understanding is influenced by grammatical relationships between tokens regardless of their identifier name. Graph representations of source code, such as the Abstract Syntax Tree (AST), can capture relationships between tokens that are not obvious from the source code. We propose a novel method, GN-Transformer, to learn end-to-end on a fused sequence and graph modality we call Syntax-Code-Graph (SCG). GN-Transformer expands on the Graph Networks (GN) framework using a self-attention mechanism. SCG is the result of the early fusion between a source code snippet and the AST representation. We perform experiments on the structure of SCG, an ablation study on the model design, and the hyper-parameters to conclude that the performance advantage is from the fused representation. The proposed methods achieve state-of-the-art performance in two code…
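The early fusion described in the abstract can be illustrated with a small sketch. This is not the authors' SCG construction: it is a toy approximation that assumes Python input, uses only the standard-library `ast` and `tokenize` modules, and attaches each source token to the deepest AST node whose source span contains it. The function name `build_fused_graph`, the node tuples, and the choice of edges are all hypothetical.

```python
import ast
import io
import tokenize


def build_fused_graph(source):
    """Toy fused token + AST graph (an assumption, not the paper's SCG).

    Returns (nodes, edges): nodes are (kind, label) tuples where kind is
    "ast" or "token"; edges are (parent_index, child_index) pairs.
    """
    tree = ast.parse(source)
    nodes = []
    edges = []
    spans = []  # (node_index, depth, start, end) for AST nodes with positions

    def visit(node, parent, depth):
        idx = len(nodes)
        nodes.append(("ast", type(node).__name__))
        if parent is not None:
            edges.append((parent, idx))  # syntactic parent -> child edge
        # Only some AST nodes carry source positions (Python 3.8+).
        if (getattr(node, "lineno", None) is not None
                and getattr(node, "end_lineno", None) is not None):
            spans.append((idx, depth,
                          (node.lineno, node.col_offset),
                          (node.end_lineno, node.end_col_offset)))
        for child in ast.iter_child_nodes(node):
            visit(child, idx, depth + 1)

    visit(tree, None, 0)

    # Layout-only tokens are dropped; the remaining tokens keep source order.
    skip = {tokenize.NEWLINE, tokenize.NL, tokenize.INDENT, tokenize.DEDENT,
            tokenize.ENDMARKER, tokenize.COMMENT, tokenize.ENCODING}
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in skip:
            continue
        idx = len(nodes)
        nodes.append(("token", tok.string))
        # Note: ast column offsets are UTF-8 byte offsets, so this span
        # comparison is only reliable for ASCII source.
        covering = [s for s in spans if s[2] <= tok.start and tok.end <= s[3]]
        if covering:
            owner = max(covering, key=lambda s: s[1])[0]  # deepest cover
            edges.append((owner, idx))  # fusion edge: AST node -> token
    return nodes, edges


nodes, edges = build_fused_graph("def add(a, b):\n    return a + b\n")
print(len(nodes), "nodes,", len(edges), "edges")
```

In this sketch the token nodes preserve the sequence order while the AST nodes contribute the syntactic structure, so one graph carries both modalities. A model could then restrict self-attention to the edges of such a graph, though how GN-Transformer does so is not specified in the text above.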
Taxonomy
Topics: Software Engineering Research · Topic Modeling · Advanced Malware Detection Techniques
