AfriSenti: A Twitter Sentiment Analysis Benchmark for African Languages
Shamsuddeen Hassan Muhammad, Idris Abdulmumin, Abinew Ali Ayele,, Nedjma Ousidhoum, David Ifeoluwa Adelani, Seid Muhie Yimam, Ibrahim Sa'id, Ahmad, Meriem Beloucif, Saif M. Mohammad, Sebastian Ruder, Oumaima Hourrane,, Pavel Brazdil, Felermino D\'ario M\'ario Ant\'onio Ali

TL;DR
This paper introduces AfriSenti, a comprehensive sentiment analysis benchmark dataset of over 110,000 tweets in 14 African languages, aiming to advance NLP research for Africa's diverse linguistic landscape.
Contribution
It provides the first large-scale, annotated sentiment dataset for multiple African languages, along with data collection, annotation methodology, and baseline experiments.
Findings
Over 200 participants in the shared task.
Baseline models show varying performance across languages.
The dataset facilitates future NLP research on African languages.
Abstract
Africa is home to over 2,000 languages from more than six language families and has the highest linguistic diversity among all continents. These include 75 languages with at least one million speakers each. Yet, there is little NLP research conducted on African languages. Crucial to enabling such research is the availability of high-quality annotated datasets. In this paper, we introduce AfriSenti, a sentiment analysis benchmark that contains a total of >110,000 tweets in 14 African languages (Amharic, Algerian Arabic, Hausa, Igbo, Kinyarwanda, Moroccan Arabic, Mozambican Portuguese, Nigerian Pidgin, Oromo, Swahili, Tigrinya, Twi, Xitsonga, and Yor\`ub\'a) from four language families. The tweets were annotated by native speakers and used in the AfriSenti-SemEval shared task (The AfriSenti Shared Task had over 200 participants. See website at https://afrisenti-semeval.github.io). We…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSentiment Analysis and Opinion Mining · Natural Language Processing Techniques · Linguistics and Discourse Analysis
