Dataset Creation and Baseline Models for Sexism Detection in Hausa
Fatima Adam Muhammad, Shamsuddeen Muhammad Hassan, Isa Inuwa-Dutse

TL;DR
This paper introduces the first Hausa sexism detection dataset, explores cultural and linguistic nuances through user studies, and evaluates machine learning models, highlighting challenges in capturing cultural context and reducing false positives.
Contribution
It presents the first Hausa sexism dataset, incorporates community insights, and assesses baseline models including few-shot learning for low-resource language sexism detection.
Findings
Capturing cultural nuances, such as idiomatic expressions, remains a challenge.
Pre-trained models show promise but struggle with false positives.
Community engagement improves dataset relevance.
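The false-positive finding above can be made concrete with a small calculation. The following is a minimal sketch, not the authors' evaluation code: the function name `false_positive_stats`, the label convention (1 = sexist, 0 = not sexist), and the toy labels are all assumptions made for illustration.

```python
def false_positive_stats(y_true, y_pred):
    """Return (false_positive_rate, precision) for binary labels.

    Assumed convention (not from the paper): 1 = sexist, 0 = not sexist.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # correctly flagged
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # wrongly flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # correctly passed
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    return fpr, precision


# Made-up toy labels for illustration only; these are not data from the paper.
y_true = [1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 1, 1, 0]
fpr, precision = false_positive_stats(y_true, y_pred)
```

A high false positive rate here means that many non-sexist posts are flagged as sexist, which also lowers precision even when every truly sexist post is caught.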
Abstract
Sexism reinforces gender inequality and social exclusion by perpetuating stereotypes, bias, and discriminatory norms. Because online platforms enable various forms of sexism to thrive, there is a growing need for effective sexism detection and mitigation strategies. While computational approaches to sexism detection are widespread in high-resource languages, progress remains limited in low-resource languages, where scarce linguistic resources and cultural differences affect how sexism is expressed and perceived. This study introduces the first Hausa sexism detection dataset, developed through community engagement, qualitative coding, and data augmentation. To capture cultural nuances and linguistic representation, we conducted a two-stage user study (n=66) involving native speakers to explore how sexism is defined and articulated in everyday discourse. We further experiment with both…
Taxonomy
Topics: Hate Speech and Cyberbullying Detection · Gender Studies in Language · Social and Intergroup Psychology
