An Empirical Study of the Effectiveness of an Ensemble of Stand-alone Sentiment Detection Tools for Software Engineering Datasets
Gias Uddin, Yann-Gael Gueheneuc, Foutse Khomh, Chanchal K Roy

TL;DR
This study evaluates the effectiveness of combining multiple stand-alone sentiment detection tools for software engineering datasets, finding that supervised ensemble methods with transformer models outperform individual tools and simple voting ensembles.
Contribution
It introduces Sentisead, a supervised ensemble approach that improves sentiment detection accuracy, and demonstrates the effectiveness of transformer-based models in this context.
Findings
The individual tools are often complementary (one is wrong where another is right), but a voting-based ensemble fails to improve accuracy.
Sentisead with supervised learning improves F1-score by up to 4%.
Transformer-based ensemble with RoBERTa achieves the best F1-score of 0.805.
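The first finding can be illustrated with a small sketch of a voting ensemble over polarity labels. This is not the authors' implementation: the function names, the neutral tie-break rule, the complementarity definition used here, and the toy data are all assumptions made for illustration.

```python
from collections import Counter


def majority_vote(labels, fallback="neutral"):
    """Return the most frequent polarity label; use `fallback` on a tie (assumed rule)."""
    counts = Counter(labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return fallback
    return counts[0][0]


def complementarity_rate(predictions, gold):
    """Among units where at least one tool is wrong, the share where another tool is right.

    This is one possible reading of "complementary"; the paper may define it differently.
    """
    some_wrong = 0
    rescued = 0
    for preds, truth in zip(predictions, gold):
        if any(p != truth for p in preds):
            some_wrong += 1
            if any(p == truth for p in preds):
                rescued += 1
    return rescued / some_wrong if some_wrong else 0.0


# Hypothetical labels from three tools on five units (toy data, not from the study).
predictions = [
    ["positive", "positive", "neutral"],
    ["negative", "neutral", "neutral"],
    ["neutral", "neutral", "neutral"],
    ["positive", "neutral", "negative"],
    ["positive", "positive", "neutral"],
]
gold = ["positive", "negative", "neutral", "negative", "negative"]

votes = [majority_vote(p) for p in predictions]
voting_accuracy = sum(v == g for v, g in zip(votes, gold)) / len(gold)

print("complementarity:", complementarity_rate(predictions, gold))
print("voting accuracy:", voting_accuracy)
```

In this toy data the tools disagree on four units and at least one tool is right on three of them, yet the vote recovers the correct label for only one of those three, because the correct tool is outvoted. That gap is what motivates a supervised ensemble that learns when to trust each tool.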
Abstract
Sentiment analysis in software engineering (SE) has shown promise to analyze and support diverse development activities. We report the results of an empirical study that we conducted to determine the feasibility of developing an ensemble engine by combining the polarity labels of stand-alone SE-specific sentiment detectors. Our study has two phases. In the first phase, we pick five SE-specific sentiment detection tools from two recently published papers by Lin et al. [31, 32], who first reported negative results with standalone sentiment detectors and then proposed an improved SE-specific sentiment detector, POME [31]. We report the study results on 17,581 units (sentences/documents) coming from six currently available sentiment benchmarks for SE. We find that the existing tools can be complementary to each other in 85-95% of the cases, i.e., one is wrong, but another is right. However,…
Taxonomy
Topics: Software Engineering Research · Topic Modeling · Software Engineering Techniques and Practices
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Linear Warmup With Linear Decay · Layer Normalization · Adam · Attention Dropout · WordPiece · Dropout
