Understanding and Evaluating Racial Biases in Image Captioning
Dora Zhao, Angelina Wang, Olga Russakovsky

TL;DR
This paper investigates racial bias in image captioning systems. It finds disparities in caption performance, sentiment, and word choice between images of lighter- and darker-skinned people, and reports that these disparities are larger in modern captioning models than in older ones.
Contribution
It contributes manual annotations of perceived gender and skin color for 28,315 people depicted in COCO images, and uses them to compare racial biases in human-written and automatically generated captions across captioning systems.
Findings
Differences between skin-color groups are larger in modern captioning systems than in older ones.
Darker-skinned individuals receive less favorable captions.
Disparities exist in caption sentiment and word choice based on skin color.
Abstract
Image captioning is an important task for benchmarking visual reasoning and for enabling accessibility for people with vision impairments. However, as in many machine learning settings, social biases can influence image captioning in undesirable ways. In this work, we study bias propagation pathways within image captioning, focusing specifically on the COCO dataset. Prior work has analyzed gender bias in captions using automatically-derived gender labels; here we examine racial and intersectional biases using manual annotations. Our first contribution is in annotating the perceived gender and skin color of 28,315 of the depicted people after obtaining IRB approval. Using these annotations, we compare racial biases present in both manual and automatically-generated image captions. We demonstrate differences in caption performance, sentiment, and word choice between images of lighter…
Taxonomy
Topics: Multimodal Machine Learning Applications · Subtitles and Audiovisual Media
