TL;DR
This paper studies extrinsic gender bias in Bangla pretrained language models. It builds four task-specific benchmark datasets, proposes a debiasing method called RandSymKL, and reports that the method reduces bias while keeping classification accuracy competitive.
Contribution
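The paper introduces RandSymKL, a bias mitigation technique for Bangla classification tasks, along with four manually annotated benchmark datasets (sentiment analysis, toxicity detection, hate speech detection, and sarcasm detection) for evaluating gender bias.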
Findings
RandSymKL effectively reduces extrinsic gender bias.
The approach maintains competitive classification accuracy.
Benchmark datasets facilitate future bias research in Bangla.
Abstract
In this study, we investigate extrinsic gender bias in Bangla pretrained language models, a largely underexplored area in low-resource languages. To assess this bias, we construct four manually annotated, task-specific benchmark datasets for sentiment analysis, toxicity detection, hate speech detection, and sarcasm detection. Each dataset is augmented using nuanced gender perturbations, where we systematically swap gendered names and terms while preserving semantic content, enabling minimal-pair evaluation of gender-driven prediction shifts. We then propose RandSymKL, a randomized debiasing strategy that combines symmetric KL divergence with cross-entropy loss to mitigate this bias across task-specific pretrained models. RandSymKL is a training approach that integrates these elements in a unified way to mitigate extrinsic gender bias in classification tasks. Our approach…
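To make the combination of cross-entropy and symmetric KL divergence concrete, here is a minimal sketch of a loss over one minimal pair (an original sentence and its gender-swapped counterpart). It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact formulation, so the weight `lam`, the function names, the choice to sum the two KL directions (rather than average them), and the choice to apply cross-entropy to both versions are all hypothetical. The randomized component of RandSymKL is not described in the abstract and is left out.

```python
import math

def softmax(logits):
    """Convert raw classifier scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def symmetric_kl(p, q):
    """Symmetric KL, taken here as KL(p||q) + KL(q||p) (an assumption)."""
    return kl(p, q) + kl(q, p)

def cross_entropy(p, label, eps=1e-12):
    """Negative log-probability assigned to the gold label."""
    return -math.log(p[label] + eps)

def pair_loss(logits_orig, logits_swapped, label, lam=1.0):
    """Hypothetical combined loss for one minimal pair.

    Cross-entropy keeps both versions accurate; the symmetric KL term
    penalizes any shift in the predicted distribution caused only by
    the gender swap. `lam` is an assumed trade-off weight.
    """
    p = softmax(logits_orig)
    q = softmax(logits_swapped)
    ce = cross_entropy(p, label) + cross_entropy(q, label)
    return ce + lam * symmetric_kl(p, q)
```

If the model produces the same logits for both versions of a sentence, the symmetric KL term is zero and only the cross-entropy terms remain. The penalty grows as the two predicted distributions diverge, which is the prediction shift that the minimal-pair evaluation is designed to expose.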
