Hate Speech Detection and Classification in Amharic Text with Deep Learning
Samuel Minale Gashe, Seid Muhie Yimam, Yaregal Assabie

TL;DR
This paper presents a new dataset and a deep learning model for detecting and classifying hate speech in Amharic social media posts, achieving a 94.8 F1-score and addressing a resource gap for this low-resource language.
Contribution
The study introduces an annotated Amharic hate speech dataset and an SBi-LSTM deep learning model that classifies text into four categories, addressing the lack of hate speech resources for Amharic.
Findings
Achieved 94.8% F1-score in hate speech classification
Annotated 5,000 Amharic social media posts and comments into four categories: racial, religious, and gender hate speech, plus non-hate speech
Demonstrated the effectiveness of an SBi-LSTM model for hate speech detection in a low-resource language
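To make the headline metric concrete, here is a minimal sketch of how an F1-score can be computed over the four categories named in the abstract. It is illustrative only: the summary does not say which averaging the authors used, so macro-averaging is an assumption, and the label strings and toy predictions below are hypothetical, not taken from the paper's data.

```python
# Labels follow the four categories in the abstract; the exact strings are assumed.
LABELS = ["racial", "religious", "gender", "non-hate"]


def per_class_f1(y_true, y_pred, label):
    """F1 for one label, treating it as the positive class (one-vs-rest)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined), so F1 is 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred, labels=LABELS):
    """Unweighted mean of per-class F1 (assumed averaging, not confirmed by the paper)."""
    return sum(per_class_f1(y_true, y_pred, lab) for lab in labels) / len(labels)


# Hypothetical toy example, not data from the paper.
y_true = ["racial", "religious", "gender", "non-hate", "non-hate", "racial"]
y_pred = ["racial", "religious", "non-hate", "non-hate", "non-hate", "gender"]

if __name__ == "__main__":
    for lab in LABELS:
        print(lab, round(per_class_f1(y_true, y_pred, lab), 3))
    print("macro F1:", round(macro_f1(y_true, y_pred), 3))
```

Macro-averaging weights every category equally, which matters when one class (typically non-hate speech) dominates the data. A weighted or micro average would give a different number on the same predictions.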
Abstract
Hate speech is a growing problem on social media. It can seriously impact society, especially in countries like Ethiopia, where it can trigger conflicts among diverse ethnic and religious groups. While hate speech detection for resource-rich languages is progressing, it is lacking for low-resource languages such as Amharic. To address this gap, we develop an Amharic hate speech dataset and an SBi-LSTM deep learning model that can detect and classify text into four categories: racial, religious, and gender hate speech, and non-hate speech. We have annotated 5k Amharic social media posts and comments into these four categories. The data were annotated by a total of 100 native Amharic speakers using a custom annotation tool. The model achieves an F1-score of 94.8. Future improvements will include expanding the dataset and developing state-of-the-art models. Keywords: Amharic hate speech detection,…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Language, Linguistics, Cultural Analysis
