Ensemble Models for Neural Source Code Summarization of Subroutines
Alexander LeClair, Aakash Bansal, Collin McMillan

TL;DR
This paper investigates ensemble methods for neural source code summarization and shows that combining diverse models improves summary quality by up to 14.8%, which points to the benefit of exploiting the orthogonal strengths of different models.
Contribution
It introduces ensemble strategies that exploit the orthogonal contributions of different neural models, significantly enhancing code summarization performance.
Findings
Ensemble models improve summarization accuracy by up to 14.8%.
Different neural models contribute uniquely to prediction quality.
A small change in inference procedure yields substantial performance gains.
Abstract
A source code summary of a subroutine is a brief description of that subroutine. Summaries underpin a majority of documentation consumed by programmers, such as the method summaries in JavaDocs. Source code summarization is the task of writing these summaries. At present, most state-of-the-art approaches for code summarization are neural network-based solutions akin to seq2seq, graph2seq, and other encoder-decoder architectures. The input to the encoder is source code, while the decoder helps predict the natural language summary. While these models tend to be similar in structure, evidence is emerging that different models make different contributions to prediction quality -- differences in model performance are orthogonal and complementary rather than uniform over the entire dataset. In this paper, we explore the orthogonal nature of different neural code summarization approaches and…
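To make the ensemble idea concrete, the sketch below shows one common way to combine encoder-decoder models at inference time: average their next-token probability distributions at each decoding step and pick the most likely token. This is a minimal illustration under stated assumptions, not the authors' implementation. The names `mean_ensemble`, `greedy_decode`, `toy_model_a`, and `toy_model_b` are hypothetical, and the toy probability tables are invented for the example.

```python
from typing import Callable, Dict, List, Sequence

# Assumption: a "model" is any callable that maps (source code tokens,
# summary prefix so far) to a probability distribution over the next token.
Distribution = Dict[str, float]
Model = Callable[[Sequence[str], Sequence[str]], Distribution]


def mean_ensemble(dists: List[Distribution]) -> Distribution:
    """Average several next-token distributions token by token."""
    vocab = set().union(*dists)
    return {tok: sum(d.get(tok, 0.0) for d in dists) / len(dists) for tok in vocab}


def greedy_decode(models: List[Model], code: Sequence[str],
                  max_len: int = 10, eos: str = "</s>") -> List[str]:
    """Greedy decoding where every step uses the averaged distribution."""
    summary: List[str] = []
    for _ in range(max_len):
        combined = mean_ensemble([m(code, summary) for m in models])
        # Sort the vocabulary first so that ties break deterministically.
        next_tok = max(sorted(combined), key=lambda t: combined[t])
        if next_tok == eos:
            break
        summary.append(next_tok)
    return summary


# Hypothetical stand-ins for two trained models with different preferences.
def toy_model_a(code: Sequence[str], prefix: Sequence[str]) -> Distribution:
    table = [{"returns": 0.6, "gets": 0.4},
             {"the": 0.9, "</s>": 0.1},
             {"name": 0.6, "id": 0.4}]
    return table[len(prefix)] if len(prefix) < len(table) else {"</s>": 1.0}


def toy_model_b(code: Sequence[str], prefix: Sequence[str]) -> Distribution:
    table = [{"returns": 0.3, "gets": 0.7},
             {"the": 0.8, "</s>": 0.2},
             {"name": 0.2, "id": 0.8}]
    return table[len(prefix)] if len(prefix) < len(table) else {"</s>": 1.0}


if __name__ == "__main__":
    code_tokens = ["int", "getId", "(", ")", "{", "return", "id", ";", "}"]
    print(greedy_decode([toy_model_a], code_tokens))
    print(greedy_decode([toy_model_b], code_tokens))
    print(greedy_decode([toy_model_a, toy_model_b], code_tokens))
```

In this toy setup each model decodes a different summary on its own, and the ensemble follows whichever model is more confident at each step. The only change from single-model decoding is the averaging step inside the loop.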
Taxonomy
Topics: Software Engineering Research · Software System Performance and Reliability · Topic Modeling
