Adaptive Data Collection for Latin-American Community-sourced Evaluation of Stereotypes (LACES)

Guido Ivetta; Pietro Palombini; Sofía Martinelli; Marcos J Gomez; M. María Echeveste; Sunipa Dev; Vinodkumar Prabhakaran; Luciana Benotti

arXiv:2510.24958·cs.CY·January 7, 2026

TL;DR

This paper introduces LACES, a new stereotype dataset for Latin America, and a novel adaptive data collection method that enhances stereotype diversity and evaluation of bias mitigation in NLP models.

Contribution

It presents a culturally-specific stereotype dataset for Latin America and a unified adaptive collection methodology that improves data diversity and collection efficiency.

Findings

01. LACES contains more unique stereotypes than previous datasets.

02. Debiasing methods are less effective on LACES, indicating regional bias challenges.

03. The adaptive collection method streamlines stereotype data gathering.

Abstract

The evaluation of societal biases in NLP models is critically hindered by a geo-cultural gap. This leaves regions such as Latin America severely underserved, making it impossible to adequately assess or mitigate the perpetuation of harmful regional stereotypes in language technologies. This paper presents LACES, a stereotype association dataset for 15 Latin American countries. The dataset includes 4,789 stereotype associations manually created and annotated by 83 participants, and it was developed through targeted community partnerships across Latin America. We also propose a novel adaptive data collection methodology that integrates the sourcing of new stereotype entries and the validation of existing data within a single, unified workflow. This approach results in a resource with more unique stereotypes than previous static collection…

Code & Models

Datasets

vialibre/LACES
dataset· 20 dl