Measuring Representational Harms in Image Captioning
Angelina Wang, Solon Barocas, Kristen Laird, Hanna Wallach

TL;DR
This paper develops techniques for measuring five types of representational harms in image captioning and applies them to two of the most popular image captioning datasets using a state-of-the-art captioning system, reflecting on the challenges of assessing fairness beyond the underspecified lens of "bias."
Contribution
It proposes multiple normatively grounded measurement techniques for each type of harm, with the aim of capturing each harm's multi-faceted nature and improving the collective validity of the resulting measurements.
Findings
Measured five types of representational harms in two of the most popular image captioning datasets
Identified challenges in the assumptions underlying the measurement approach
Provided insights into the multi-faceted nature of each type of harm
Abstract
Previous work has largely considered the fairness of image captioning systems through the underspecified lens of "bias." In contrast, we present a set of techniques for measuring five types of representational harms, as well as the resulting measurements obtained for two of the most popular image captioning datasets using a state-of-the-art image captioning system. Our goal was not to audit this image captioning system, but rather to develop normatively grounded measurement techniques, in turn providing an opportunity to reflect on the many challenges involved. We propose multiple measurement techniques for each type of harm. We argue that by doing so, we are better able to capture the multi-faceted nature of each type of harm, in turn improving the (collective) validity of the resulting measurements. Throughout, we discuss the assumptions underlying our measurement approach and point…
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Law in Society and Culture
