Dopamin: Transformer-based Comment Classifiers through Domain Post-Training and Multi-level Layer Aggregation
Nam Le Hai, Nghi D. Q. Bui

TL;DR
Dopamin is a Transformer-based code comment classifier that improves comment representation and shares knowledge of common categories across programming languages, outperforming the STACC baseline in average F1-score while maintaining a comparable inference time.
Contribution
The paper introduces Dopamin, a novel Transformer-based model with domain post-training and multi-level layer aggregation for effective code comment classification across multiple programming languages.
Findings
Outperforms the STACC baseline by 3% in average F1-score on the NLBSE'24 Tool Competition dataset.
Achieves robust comment classification performance across multiple languages.
Maintains comparable inference time for practical deployment.
Abstract
Code comments provide important information for understanding the source code. They can help developers understand the overall purpose of a function or class, as well as identify bugs and technical debt. However, an overabundance of comments is meaningless and counterproductive. As a result, it is critical to automatically filter out these comments for specific purposes. In this paper, we present Dopamin, a Transformer-based tool for dealing with this issue. Our model excels not only in presenting knowledge sharing of common categories across multiple languages, but also in achieving robust performance in comment classification by improving comment representation. As a result, it outperforms the STACC baseline by 3% on the NLBSE'24 Tool Competition dataset in terms of average F1-score, while maintaining a comparable inference time for practical use. The source code is publicly…
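To make the reported metric concrete, here is a minimal sketch of how an average F1-score over comment categories can be computed. It assumes each (language, category) pair is scored as a separate binary task and that the per-category scores are averaged with equal weight; the page does not spell out the exact averaging. The function names, category names, and toy labels below are hypothetical and are not taken from the paper or the dataset.

```python
from typing import Dict, List, Tuple


def f1_score(true_labels: List[int], predicted_labels: List[int]) -> float:
    """Binary F1 for one category: harmonic mean of precision and recall."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        # No correct positive prediction: define F1 as 0 to avoid dividing by zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def average_f1(per_category: Dict[str, Tuple[List[int], List[int]]]) -> float:
    """Unweighted mean of the per-category F1 scores (assumed averaging scheme)."""
    scores = [f1_score(truth, pred) for truth, pred in per_category.values()]
    return sum(scores) / len(scores)


# Hypothetical toy data: category name -> (true labels, predicted labels).
toy_results = {
    "java/summary": ([1, 1, 0, 0], [1, 0, 0, 0]),
    "python/usage": ([1, 0, 1, 0], [1, 0, 1, 1]),
}

if __name__ == "__main__":
    for name, (truth, pred) in toy_results.items():
        print(f"{name}: F1 = {f1_score(truth, pred):.3f}")
    print(f"average F1 = {average_f1(toy_results):.3f}")
```

Under this scheme every category counts equally, so a small category with poor F1 lowers the average as much as a large one would.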
Taxonomy
Topics: Topic Modeling · Advanced Text Analysis Techniques · Sentiment Analysis and Opinion Mining
