Towards Trustworthy Sentiment Analysis in Software Engineering: Dataset Characteristics and Tool Selection
Martin Obaidi, Marc Herrmann, Jil Klünder, Kurt Schneider

TL;DR
This paper analyzes developer communication datasets from software engineering to evaluate sentiment analysis tools and proposes a method for selecting trustworthy tools based on dataset features. Transformer models perform strongly, but their performance depends on context.
Contribution
The paper introduces a dataset characterization and tool selection approach for sentiment analysis in software engineering, aimed at improving trustworthiness and recommending tools that fit the context of a given dataset.
Findings
Transformer models like SetFit and RoBERTa perform well across datasets.
Dataset characteristics significantly influence tool effectiveness.
Context-dependent performance highlights the need for tailored tool selection.
Abstract
Software development relies heavily on text-based communication, making sentiment analysis a valuable tool for understanding team dynamics and supporting trustworthy AI-driven analytics in requirements engineering. However, existing sentiment analysis tools often perform inconsistently across datasets from different platforms, due to variations in communication style and content. In this study, we analyze linguistic and statistical features of 10 developer communication datasets from five platforms and evaluate the performance of 14 sentiment analysis tools. Based on these results, we propose a mapping approach and questionnaire that recommends suitable sentiment analysis tools for new datasets, using their characteristic features as input. Our results show that dataset characteristics can be leveraged to improve tool selection, as platforms differ substantially in both linguistic…
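The idea of mapping dataset characteristics to a tool recommendation can be illustrated with a minimal sketch. This is not the authors' method: the three features, the nearest-profile lookup, the tool names `tool_a` and `tool_b`, and all profile values are hypothetical placeholders chosen only to show the general shape of such a mapping.

```python
import math


def dataset_features(texts):
    """Compute a few simple statistics for a list of messages.

    The features (average token count, exclamation marks per message,
    question marks per message) are illustrative assumptions, not the
    feature set used in the paper.
    """
    n = len(texts)
    avg_tokens = sum(len(t.split()) for t in texts) / n
    exclaim_rate = sum(t.count("!") for t in texts) / n
    question_rate = sum(t.count("?") for t in texts) / n
    return (avg_tokens, exclaim_rate, question_rate)


def recommend_tool(features, profiles):
    """Return the tool whose reference profile is closest (Euclidean distance)."""
    return min(profiles, key=lambda name: math.dist(features, profiles[name]))


# Hypothetical reference profiles: placeholder numbers, not measured results.
PROFILES = {
    "tool_a": (5.0, 1.0, 0.0),   # short, exclamation-heavy messages
    "tool_b": (30.0, 0.0, 0.5),  # long, question-heavy messages
}

sample = ["Great fix, thanks!", "This works now!"]
recommended = recommend_tool(dataset_features(sample), PROFILES)
```

In this sketch the features are not scaled, so the token count dominates the distance. A real mapping would need normalized features and profiles derived from actual tool evaluations.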
