HierarchyNet: Learning to Summarize Source Code with Heterogeneous Representations
Minh Huynh Nguyen, Nghi D. Q. Bui, Truong Son Hy, Long Tran-Thanh, Tien N. Nguyen

TL;DR
HierarchyNet introduces a hierarchical approach using heterogeneous code representations and specialized neural modules to improve code summarization, effectively capturing multi-level code features and dependencies.
Contribution
It presents a novel HierarchyNet architecture that processes heterogeneous code representations with multiple neural modules and a Hierarchical-Aware Cross Attention layer, outperforming existing methods.
Findings
Surpasses state-of-the-art code summarization techniques, including PA-Former, CAST, and NeuralCodeSum
Effectively captures lexical, syntactic, and semantic code features
Preserves dependencies between code elements
Abstract
We propose a novel method for code summarization utilizing Heterogeneous Code Representations (HCRs) and our specially designed HierarchyNet. HCRs effectively capture essential code features at lexical, syntactic, and semantic levels by abstracting coarse-grained code elements and incorporating fine-grained program elements in a hierarchical structure. Our HierarchyNet method processes each layer of the HCR separately through a unique combination of the Heterogeneous Graph Transformer, a Tree-based CNN, and a Transformer Encoder. This approach preserves dependencies between code elements and captures relations through a novel Hierarchical-Aware Cross Attention layer. Our method surpasses current state-of-the-art techniques, such as PA-Former, CAST, and NeuralCodeSum.
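As a rough illustration of how representations at one level of a hierarchy can attend to another level, the sketch below implements plain scaled dot-product cross attention in pure Python. It is not the paper's Hierarchical-Aware Cross Attention layer, whose details are not given in this abstract. The names `cross_attention`, `coarse_nodes`, and `fine_tokens`, and the toy embeddings, are assumptions made only for this example.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Generic scaled dot-product cross attention (not the paper's exact layer).

    queries: vectors from one level (here, hypothetical coarse-grained nodes)
    keys, values: vectors from another level (here, hypothetical fine-grained tokens)
    Returns one attended vector per query.
    """
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


# Hypothetical toy embeddings: two coarse-grained nodes, three fine-grained tokens.
coarse_nodes = [[1.0, 0.0], [0.0, 1.0]]
fine_tokens = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]

attended = cross_attention(coarse_nodes, fine_tokens, fine_tokens)
```

Each row of `attended` is a weighted mix of the fine-grained vectors, with the largest weights on the tokens most similar to the corresponding coarse-grained node. A real model would add learned projections, multiple heads, and whatever hierarchy-specific masking the authors use.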
Taxonomy
Topics: Software Engineering Research · Software Testing and Debugging Techniques · Advanced Malware Detection Techniques
Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Softmax · Dense Connections · Position-Wise Feed-Forward Layer · Byte Pair Encoding · Multi-Head Attention · Absolute Position Encodings · Dropout
