Debiasing Vision-Language Models via Biased Prompts
Ching-Yao Chuang, Varun Jampani, Yuanzhen Li, Antonio Torralba, Stefanie Jegelka

TL;DR
This paper introduces a method to reduce social biases in vision-language models by projecting out biased directions in text embeddings, improving fairness and robustness without extra data or training.
Contribution
The authors propose a simple, closed-form debiasing technique for vision-language models that reduces bias in both discriminative and generative tasks.
Findings
Reduces social bias and spurious correlations in models
Effective for both classification and generation tasks
No additional data or training required
Abstract
Machine learning models have been shown to inherit biases from their training datasets. This can be particularly problematic for vision-language foundation models trained on uncurated datasets scraped from the internet. The biases can be amplified and propagated to downstream applications like zero-shot classifiers and text-to-image generative models. In this study, we propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding. In particular, we show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models. The proposed closed-form solution enables easy integration into large-scale pipelines, and empirical results demonstrate that our approach effectively reduces social bias and spurious correlation in both discriminative and…
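To make "projecting out biased directions in the text embedding" concrete, here is a minimal NumPy sketch of a plain orthogonal projection. It is an illustration under stated assumptions, not the authors' implementation: the embeddings are random stand-ins rather than outputs of a real text encoder, the function name `projection_removing` is hypothetical, and the calibration step that the abstract mentions is not shown.

```python
import numpy as np


def projection_removing(directions):
    """Return P = I - A A^+, the orthogonal projector onto the complement
    of the span of the given directions (A holds them as columns)."""
    A = np.asarray(directions, dtype=float).T  # shape (d, k)
    d = A.shape[0]
    # A @ pinv(A) is the orthogonal projector onto the column space of A.
    return np.eye(d) - A @ np.linalg.pinv(A)


# Toy data: random vectors standing in for text embeddings (assumption).
rng = np.random.default_rng(0)
d = 8
z_a = rng.normal(size=d)  # stand-in for the embedding of one biased prompt
z_b = rng.normal(size=d)  # stand-in for the embedding of a contrasting prompt
bias_direction = z_a - z_b

P = projection_removing([bias_direction])

# Debias a stand-in class-prompt embedding, then renormalize it.
z_class = rng.normal(size=d)
z_debiased = P @ z_class
z_debiased /= np.linalg.norm(z_debiased)

# The debiased embedding has no component left along the bias direction.
print(abs(z_debiased @ bias_direction))
```

Because only the text embedding is modified, a sketch like this would leave the image encoder untouched; the projection is applied once per prompt before similarity scores are computed.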
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Domain Adaptation and Few-Shot Learning
