Measuring Machine Learning Harms from Stereotypes Requires Understanding Who Is Harmed by Which Errors in What Ways
Angelina Wang, Xuechunzi Bai, Solon Barocas, Su Lin Blodgett

TL;DR
This study investigates how different types of machine learning errors, especially stereotype-related ones, cause varying levels of experiential harm, and argues that fairness assessments need to account for who is harmed and how.
Contribution
It introduces a human-centered approach to measuring ML harms by integrating social psychology insights and experimental data on stereotype-related errors.
Findings
Stereotype-reinforcing errors cause more experiential (i.e., subjective) harm, with minimal changes to cognitive beliefs, attitudes, or behaviors.
Harm varies by gender and error type: the experiential harm of stereotype-reinforcing errors affects women more than men.
Certain stereotype-violating errors are more harmful to men.
Abstract
As machine learning applications proliferate, we need an understanding of their potential for harm. However, current fairness metrics are rarely grounded in human psychological experiences of harm. Drawing on the social psychology of stereotypes, we use a case study of gender stereotypes in image search to examine how people react to machine learning errors. First, we use survey studies to show that not all machine learning errors reflect stereotypes nor are equally harmful. Then, in experimental studies we randomly expose participants to stereotype-reinforcing, -violating, and -neutral machine learning errors. We find stereotype-reinforcing errors induce more experientially (i.e., subjectively) harmful experiences, while having minimal changes to cognitive beliefs, attitudes, or behaviors. This experiential harm impacts women more than men. However, certain stereotype-violating errors…
Taxonomy
Topics: Ethics and Social Impacts of AI
