Yet Another Combination of IR- and Neural-based Comment Generation
Huang Yuchao, Wei Moshi, Wang Song, Wang Junjie, Wang Qing

TL;DR
This paper introduces a dynamic method combining IR- and neural-based techniques for code comment generation, significantly improving performance over existing static combinations by adaptively selecting the best approach for each code snippet.
Contribution
The paper proposes a novel dynamic combination approach using a classifier to select between IR- and neural-based methods, enhancing comment generation accuracy.
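The per-snippet selection idea can be illustrated with a minimal sketch. This is not the paper's implementation: the Jaccard token similarity, the fixed threshold standing in for a trained classifier, and every name below (`tokenize`, `ir_generate`, `dynamic_generate`, `threshold`) are hypothetical choices made only for illustration. The neural generator is passed in as an arbitrary callable.

```python
from typing import Callable, List, Set, Tuple

Corpus = List[Tuple[str, str]]  # (code snippet, reference comment) pairs


def tokenize(code: str) -> Set[str]:
    """Crude whitespace tokenizer (assumption: a real system would use a lexer)."""
    return set(code.replace("(", " ").replace(")", " ").split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard similarity between two token sets; 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def ir_generate(code: str, corpus: Corpus) -> Tuple[str, float]:
    """IR-style generation: reuse the comment of the most similar snippet."""
    query = tokenize(code)
    best_comment, best_sim = "", 0.0
    for snippet, comment in corpus:
        sim = jaccard(query, tokenize(snippet))
        if sim > best_sim:
            best_comment, best_sim = comment, sim
    return best_comment, best_sim


def dynamic_generate(
    code: str,
    corpus: Corpus,
    neural_generate: Callable[[str], str],
    threshold: float = 0.5,
) -> Tuple[str, str]:
    """Pick one generator per snippet and return (comment, chosen_source).

    The threshold rule is a hypothetical stand-in for a learned classifier.
    """
    ir_comment, sim = ir_generate(code, corpus)
    if sim >= threshold:
        return ir_comment, "ir"
    return neural_generate(code), "neural"
```

In this sketch, a snippet that closely matches an entry in the retrieval corpus gets the retrieved comment, and any other snippet falls through to the neural generator. Replacing the threshold with a trained classifier would change only the decision inside `dynamic_generate`.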
Findings
Achieved a BLEU score of 25.45, outperforming previous methods.
Improved on the state of the art by 41% over IR-based, 26% over neural-based, and 7% over static combination approaches.
Demonstrated effectiveness on a large-scale Java dataset.
Abstract
Code comment generation techniques aim to generate natural language descriptions for source code. There are two orthogonal approaches for this task, i.e., information retrieval (IR) based and neural-based methods. Recent studies have focused on combining their strengths by feeding the input code and its similar code snippets retrieved by the IR-based approach to the neural-based approach, which can enhance the neural-based approach's ability to output low-frequency words and further improve the performance. However, despite the tremendous progress, our pilot study reveals that the current combination is not generalizable and can lead to performance degradation. In this paper, we propose a straightforward but effective approach to tackle the issue of existing combinations of these two comment generation approaches. Instead of binding IR- and neural-based approaches statically, we combine…
Taxonomy
Topics: Software Engineering Research · Topic Modeling · Natural Language Processing Techniques
