GeeSanBhava: Sentiment Tagged Sinhala Music Video Comment Data Set
Yomal De Mel, Nisansa de Silva

TL;DR
This paper introduces GeeSanBhava, a high-quality Sinhala music video comment dataset annotated for emotions, and demonstrates the effectiveness of a multi-layer perceptron model in emotion recognition with promising results.
Contribution
The study provides a novel annotated Sinhala music comment dataset and evaluates machine learning models for emotion detection, addressing challenges in comment-based emotion analysis.
Findings
Inter-annotator agreement reached Fleiss' kappa = 84.96%.
The optimized MLP model achieved a ROC-AUC score of 0.887.
Distinct emotional profiles for different songs were identified.
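To make the agreement figure concrete, here is a minimal sketch of how Fleiss' kappa can be computed for a fixed number of annotators per comment. The label names (`Q1` to `Q4`, standing in for valence-arousal quadrants) and the toy ratings are illustrative assumptions, not data or labels taken from the paper.

```python
from collections import Counter

def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for items that were each labelled by the same number of raters.

    ratings: list of per-item label lists, e.g. [["Q1", "Q1", "Q2"], ...]
    categories: every label that may occur (hypothetical names here).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(item) for item in ratings]

    # Observed agreement: fraction of agreeing rater pairs, averaged over items.
    p_items = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_items) / n_items

    # Chance agreement from the overall share of each category.
    p_cat = [sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories]
    p_e = sum(p ** 2 for p in p_cat)

    # Undefined (division by zero) when every rating falls in a single category.
    return (p_bar - p_e) / (1 - p_e)

# Toy example with three annotators per comment (assumed labels).
toy = [["Q1", "Q1", "Q1"], ["Q2", "Q2", "Q2"], ["Q1", "Q1", "Q2"]]
kappa = fleiss_kappa(toy, ["Q1", "Q2", "Q3", "Q4"])
```

A value of 1 indicates perfect agreement and values near 0 indicate agreement no better than chance; the paper reports its value as a percentage.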
Abstract
This study introduces GeeSanBhava, a high-quality data set of Sinhala song comments extracted from YouTube and manually tagged using Russell's Valence-Arousal model by three independent human annotators. The human annotators achieved a substantial inter-annotator agreement (Fleiss' kappa = 84.96%). The analysis revealed distinct emotional profiles for different songs, highlighting the importance of comment-based emotion mapping. The study also addressed the challenges of comparing comment-based and song-based emotions, mitigating biases inherent in user-generated content. A number of machine learning and deep learning models were pre-trained on a related large data set of Sinhala news comments in order to report zero-shot results on our Sinhala YouTube comment data set. An optimized Multi-Layer Perceptron model, after extensive hyperparameter tuning, achieved a ROC-AUC score of 0.887. The…
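As a reminder of what the reported ROC-AUC measures, the sketch below computes the binary case directly from its ranking definition: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The labels and scores are made-up illustrative values, and the summary above does not say how the paper aggregates the metric across emotion classes, so this shows only the two-class building block.

```python
def roc_auc(labels, scores):
    """Binary ROC-AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 ground-truth values.
    scores: iterable of model scores, higher meaning "more likely positive".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC-AUC needs at least one positive and one negative example")

    # Count positive-over-negative wins; a tie contributes half a win.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

# Illustrative values only (not from the paper).
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

This pairwise form is quadratic in the number of examples, which is fine for illustration; a library routine would normally be used on a full data set.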
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Emotion and Mood Recognition · Music and Audio Processing
