Anubhuti -- An annotated dataset for emotional analysis of Bengali short stories
Aditya Pal, Bhaskar Karn

TL;DR
This paper introduces Anubhuti, which the authors describe as the first and largest annotated corpus of Bengali short stories for emotion analysis. It was built through manual data collection and annotation and checked with baseline machine learning and deep learning classifiers.
Contribution
The paper presents the creation of Anubhuti, a comprehensive and high-quality Bengali emotion dataset, along with baseline models demonstrating its utility for emotion analysis.
Findings
High inter-annotator agreement due to expert annotation
Baseline models achieve high accuracy on emotion classification
Dataset facilitates linguistic and data analysis of Bengali emotional expression
Abstract
Thousands of short stories and articles are written every day in many different languages around the world. Bengali, or Bangla, is the second most widely spoken language in India after Hindi and is the national language of Bangladesh. This work reports in detail the creation of Anubhuti -- the first and largest text corpus for analyzing emotions expressed by writers of Bengali short stories. We explain the data collection methods, the manual annotation process, and the resulting high inter-annotator agreement, which we attribute to the linguistic expertise of the annotators and the clear labelling methodology followed. We also address some of the challenges faced in collecting raw data and annotating text in a low-resource language like Bengali. We have verified the performance of our dataset with baseline Machine Learning as well as a Deep Learning model for…
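The abstract reports high inter-annotator agreement but, as excerpted here, does not say which statistic was used. As a rough illustration only, the sketch below computes Cohen's kappa, one common agreement measure for two annotators. The function name `cohen_kappa`, the emotion labels, and the sample annotations are all hypothetical and are not taken from Anubhuti.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: computed from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[k] * freq_b[k] for k in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical annotations (assumption: not real Anubhuti data).
annotator_1 = ["joy", "sadness", "anger", "joy", "sadness", "joy"]
annotator_2 = ["joy", "sadness", "anger", "joy", "joy", "joy"]
kappa = cohen_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.3f}")
```

Kappa corrects raw agreement for the agreement expected by chance, so a skewed label distribution does not inflate the score. With more than two annotators, a multi-rater statistic such as Fleiss' kappa would be the usual substitute.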
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Topic Modeling · Advanced Text Analysis Techniques
Methods: Feature Selection
