Invisible Relevance Bias: Text-Image Retrieval Models Prefer AI-Generated Images
Shicheng Xu, Danyang Hou, Liang Pang, Jingcheng Deng, Jun Xu, Huawei Shen, Xueqi Cheng

TL;DR
This paper uncovers an invisible relevance bias in text-image retrieval models favoring AI-generated images over real ones, explores its causes, and proposes a debiasing method to mitigate this bias.
Contribution
The study constructs a benchmark to detect the bias, shows that the bias is prevalent across retrieval models and worsens when AI-generated images are included in the training data, and introduces a training method to reduce it.
Findings
Retrieval models rank AI-generated images higher than real images, even when the generated images are no more visually relevant to the query.
The bias is consistent across models with different architectures and training data.
Debiasing reduces the ranking bias and reveals embedded information in generated images.
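To make the first finding concrete, here is a minimal sketch of one way such a ranking preference could be quantified. It is an illustration only, not the metric used in the paper: the function name `generated_preference_rate` and the example scores are hypothetical, and it assumes that each query has one real image and one AI-generated counterpart of comparable relevance, each already scored by some text-image retrieval model.

```python
from typing import Sequence


def generated_preference_rate(
    real_scores: Sequence[float],
    generated_scores: Sequence[float],
) -> float:
    """Fraction of queries where the AI-generated image outscores its real counterpart.

    Hypothetical helper for illustration. Ties count as half a win, so a value
    of 0.5 means no preference and values above 0.5 mean the retriever favors
    generated images.
    """
    if len(real_scores) != len(generated_scores):
        raise ValueError("score lists must have the same length")
    if len(real_scores) == 0:
        raise ValueError("at least one query is required")

    wins = 0.0
    for real, generated in zip(real_scores, generated_scores):
        if generated > real:
            wins += 1.0
        elif generated == real:
            wins += 0.5
    return wins / len(real_scores)


# Made-up query-image similarity scores for four queries (not data from the paper).
real = [0.31, 0.28, 0.40, 0.22]
generated = [0.35, 0.27, 0.44, 0.22]
print(generated_preference_rate(real, generated))  # 0.625
```

If the paired images are equally relevant, an unbiased retriever should land near 0.5 on a large enough query set; a rate that stays well above 0.5 would be consistent with the bias described here.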
Abstract
With the advancement of generation models, AI-generated content (AIGC) is becoming more realistic, flooding the Internet. A recent study suggests that this phenomenon causes source bias in text retrieval for web search. Specifically, neural retrieval models tend to rank generated texts higher than human-written texts. In this paper, we extend the study of this bias to cross-modal retrieval. Firstly, we successfully construct a suitable benchmark to explore the existence of the bias. Subsequent extensive experiments on this benchmark reveal that AI-generated images introduce an invisible relevance bias to text-image retrieval models. Specifically, our experiments show that text-image retrieval models tend to rank the AI-generated images higher than the real images, even though the AI-generated images do not exhibit more visually relevant features to the query than real images. This…
Taxonomy
Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Image and Video Retrieval Techniques
