Adaptive Data Collection for Latin-American Community-sourced Evaluation of Stereotypes (LACES)
Guido Ivetta, Pietro Palombini, Sofía Martinelli, Marcos J Gomez, M. María Echeveste, Sunipa Dev, Vinodkumar Prabhakaran, Luciana Benotti

TL;DR
This paper introduces LACES, a new stereotype dataset for Latin America, and a novel adaptive data collection method that enhances stereotype diversity and evaluation of bias mitigation in NLP models.
Contribution
It presents a culturally-specific stereotype dataset for Latin America and a unified adaptive collection methodology that improves data diversity and collection efficiency.
Findings
LACES contains more unique stereotypes than previous datasets.
Debiasing methods are less effective on LACES, indicating regional bias challenges.
The adaptive collection method streamlines stereotype data gathering.
Abstract
The evaluation of societal biases in NLP models is critically hindered by a geo-cultural gap. This leaves regions such as Latin America severely underserved, making it impossible to adequately assess or mitigate the perpetuation of harmful regional stereotypes in language technologies. This paper presents LACES, a stereotype association dataset for 15 Latin American countries. This dataset includes 4,789 stereotype associations manually created and annotated by 83 participants. The dataset was developed through targeted community partnerships across Latin America. Additionally, in this paper, we propose a novel adaptive data collection methodology that uniquely integrates the sourcing of new stereotype entries and the validation of existing data within a single, unified workflow. This approach results in a resource with more unique stereotypes than previous static collection…
