BESSTIE: A Benchmark for Sentiment and Sarcasm Classification for Varieties of English
Dipankar Srirag, Aditya Joshi, Jordan Painter, Diptesh Kanojia

TL;DR
BESSTIE is a benchmark dataset for sentiment and sarcasm classification across Australian, Indian, and British English, built to measure the bias and generalization problems that large language models show on these varieties.
Contribution
The paper presents BESSTIE, a labeled dataset for sentiment and sarcasm detection in three varieties of English, along with an evaluation of LLM performance and an analysis of variety-specific challenges.
Findings
Models perform better on en-AU and en-UK than en-IN.
Sarcasm classification is more challenging than sentiment classification across all three varieties.
Cross-variety generalization remains a significant challenge.
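As an illustration of how the cross-variety finding could be quantified, here is a minimal sketch that scores every (training variety, test variety) pair and reports how much performance drops when the training variety differs from the test variety. It assumes macro-averaged F1 as the metric. The function names (`macro_f1`, `cross_variety_matrix`, `generalization_gap`) are hypothetical and not taken from the paper's code, and the labels in the usage example are made-up toy values, not results from BESSTIE.

```python
def macro_f1(gold, pred):
    """Unweighted mean of per-class F1 over all labels seen in gold or pred."""
    labels = sorted(set(gold) | set(pred))
    scores = []
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


def cross_variety_matrix(predictions):
    """Map each (train_variety, test_variety) pair to its macro F1.

    `predictions` maps the pair to a (gold_labels, predicted_labels) tuple.
    """
    return {pair: macro_f1(gold, pred) for pair, (gold, pred) in predictions.items()}


def generalization_gap(matrix, variety):
    """In-variety score minus the mean score of models trained on other varieties."""
    cross = [score for (train, test), score in matrix.items()
             if test == variety and train != variety]
    if not cross:
        raise ValueError(f"no cross-variety results for {variety}")
    return matrix[(variety, variety)] - sum(cross) / len(cross)


# Toy labels for illustration only (1 = sarcastic, 0 = not sarcastic).
toy_predictions = {
    ("en-IN", "en-IN"): ([1, 1, 0, 0], [1, 1, 0, 0]),
    ("en-AU", "en-IN"): ([1, 1, 0, 0], [1, 0, 0, 0]),
}
matrix = cross_variety_matrix(toy_predictions)
gap = generalization_gap(matrix, "en-IN")
```

A positive gap means a model trained on another variety does worse on the target variety than a model trained on that variety itself.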
Abstract
Despite large language models (LLMs) being known to exhibit bias against non-standard language varieties, there are no known labelled datasets for sentiment analysis of English. To address this gap, we introduce BESSTIE, a benchmark for sentiment and sarcasm classification for three varieties of English: Australian (en-AU), Indian (en-IN), and British (en-UK). We collect datasets for these language varieties using two methods: location-based for Google Places reviews, and topic-based filtering for Reddit comments. To assess whether the dataset accurately represents these varieties, we conduct two validation steps: (a) manual annotation of language varieties and (b) automatic language variety prediction. Native speakers of the language varieties manually annotate the datasets with sentiment and sarcasm labels. We perform an additional annotation exercise to validate the reliance of the…
Taxonomy
Topics: Authorship Attribution and Profiling
