Debiasing Large Vision-Language Models by Ablating Protected Attribute Representations

Neale Ratzlaff; Matthew Lyle Olson; Musashi Hinck; Shao-Yen Tseng; Vasudev Lal; Phillip Howard

arXiv:2410.13976·cs.CV·October 21, 2024

Open Access

TL;DR

This paper introduces a simple, training-free method to reduce societal bias in large vision-language models by ablating biased attributes during text generation, maintaining performance while minimizing biased outputs.

Contribution

The authors propose a novel, training-free debiasing framework for LVLMs that directly ablates protected attribute representations during text generation.

Findings

01. Reduces biased attribute mentions in generated text

02. Maintains captioning performance on real datasets

03. Achieves debiasing without sacrificing model accuracy

Abstract

Large Vision Language Models (LVLMs) such as LLaVA have demonstrated impressive capabilities as general-purpose chatbots that can engage in conversations about a provided input image. However, their responses are influenced by societal biases present in their training datasets, leading to undesirable differences in how the model responds when presented with images depicting people of different demographics. In this work, we propose a novel debiasing framework for LVLMs by directly ablating biased attributes during text generation to avoid generating text related to protected attributes, or even representing them internally. Our method requires no training and a relatively small number of representative biased outputs (~1000 samples). Our experiments show that not only can we minimize the propensity of LVLMs to generate text related to protected attributes, but we can even use…
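The abstract does not spell out how the ablation is carried out, so the following is only a minimal sketch of one common way to ablate a representation: estimate a single attribute direction as the difference of mean activations between biased and neutral samples, then project that direction out of each hidden state. The difference-of-means estimator, the function names `attribute_direction` and `ablate`, and the random toy arrays standing in for model activations are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def attribute_direction(biased_acts, neutral_acts):
    """Unit-norm difference of mean activations (assumed estimator)."""
    d = biased_acts.mean(axis=0) - neutral_acts.mean(axis=0)
    return d / np.linalg.norm(d)

def ablate(hidden, direction):
    """Remove the component of each hidden state along `direction`."""
    return hidden - np.outer(hidden @ direction, direction)

# Toy data only: random vectors stand in for activations collected
# from roughly 1000 biased and 1000 neutral outputs.
rng = np.random.default_rng(0)
dim = 16
true_dir = np.zeros(dim)
true_dir[0] = 1.0
neutral = rng.normal(size=(1000, dim))
biased = rng.normal(size=(1000, dim)) + 3.0 * true_dir

direction = attribute_direction(biased, neutral)

# Hidden states at generation time (again synthetic) before and after ablation.
hidden = rng.normal(size=(4, dim)) + 3.0 * true_dir
cleaned = ablate(hidden, direction)
```

Under these assumptions, `cleaned` has no component along `direction`, while everything orthogonal to it is left unchanged. In a real LVLM the same projection would have to be applied to intermediate activations during decoding, which this sketch does not attempt.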

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Natural Language Processing Techniques