A Comprehensive Framework to Operationalize Social Stereotypes for Responsible AI Evaluations
Aida Davani, Sunipa Dev, Héctor Pérez-Urbina, Vinodkumar Prabhakaran

TL;DR
This paper proposes a unified framework for operationalizing social stereotypes in AI evaluations, integrating social psychological insights with NLP methods to improve responsible AI practices.
Contribution
It introduces a comprehensive framework that captures key stereotype components, enabling more holistic and effective responsible AI evaluations.
Findings
The framework identifies five components of a stereotype: target group, attributes, relationships, perceivers, and context.
Provides guidelines for responsible use of stereotype operationalization.
Enhances understanding of stereotypes' impact on AI outcomes.
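As an illustration only, the components named in the findings above could be held in a small record type. This is a minimal sketch, not the authors' schema: the class name `StereotypeRecord`, its field names, the `missing_components` helper, the choice of which fields are optional, and the placeholder values are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class StereotypeRecord:
    """Hypothetical container for one stereotype, split into five components."""
    target_group: str                 # who the stereotype is about
    attribute: str                    # trait or behavior ascribed to the group
    relationship: str                 # how group and attribute are linked (label is assumed)
    perceiver: Optional[str] = None   # who holds the stereotype, if known
    context: Optional[str] = None     # setting in which it applies, if known

    def missing_components(self) -> List[str]:
        """Return the names of optional components left unspecified."""
        return [name for name in ("perceiver", "context")
                if getattr(self, name) is None]


# Placeholder values; no real stereotype data is implied.
record = StereotypeRecord(
    target_group="group A",
    attribute="attribute X",
    relationship="associated_with",
    context="workplace",
)
print(record.missing_components())  # prints ['perceiver']
```

Making the perceiver and context optional is a design choice of this sketch: it lets an evaluation flag entries where those components were never recorded instead of silently treating them as universal.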
Abstract
Societal stereotypes are at the center of a myriad of responsible AI interventions targeted at reducing the generation and propagation of potentially harmful outcomes. While these efforts are much needed, they tend to be fragmented and often address different parts of the issue without adopting a unified or holistic approach to social stereotypes and how they impact various parts of the machine learning pipeline. As a result, current interventions fail to capitalize on the underlying mechanisms that are common across different types of stereotypes, and to anchor on particular aspects that are relevant in certain cases. In this paper, we draw on social psychological research and build on NLP data and methods, to propose a unified framework to operationalize stereotypes in generative AI evaluations. Our framework identifies key components of stereotypes that are crucial in AI evaluation,…
Taxonomy
Topics: Ethics and Social Impacts of AI
