On the Domain Robustness of Contrastive Vision-Language Models

Mario Koddenbrock; Rudolf Hoffmann; David Brodmann; Erik Rodner

arXiv:2506.23663·cs.CV·July 1, 2025

Open Access

TL;DR

This paper introduces Deepbench, a framework that uses a large language model (LLM) to generate domain-specific image corruptions, so that the robustness of vision-language models can be assessed for a given deployment domain without labeled data.

Contribution

The paper presents Deepbench, a novel, open-source framework for evaluating domain-specific robustness of vision-language models using LLM-generated image corruptions.

Findings

1. Significant variability in model robustness across domains.

2. Deepbench effectively generates realistic, domain-specific image corruptions.

3. Open-source release supports further robustness research.

Abstract

In real-world vision-language applications, practitioners increasingly rely on large, pretrained foundation models rather than custom-built solutions, despite limited transparency regarding their training data and processes. While these models achieve impressive performance on general benchmarks, their effectiveness can decline notably under specialized domain shifts, such as unique imaging conditions or environmental variations. In this work, we introduce Deepbench, a framework designed to assess domain-specific robustness of vision-language models (VLMs). Deepbench leverages a large language model (LLM) to generate realistic, context-aware image corruptions tailored to specific deployment domains without requiring labeled data. We evaluate a range of contrastive vision-language architectures and architectural variants across six real-world domains and observe substantial variability…
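The evaluation idea described in the abstract can be illustrated with a minimal sketch. It assumes a contrastive model that classifies an image by picking the class prompt whose text embedding has the highest cosine similarity with the image embedding, and it compares accuracy on clean and corrupted inputs. The function names (`zero_shot_predict`, `accuracy`, `relative_robustness`), the toy embeddings, and the accuracy-ratio score are all hypothetical and chosen for illustration; they are not taken from the Deepbench code, and the paper's own metric may differ.

```python
import math
from typing import Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def zero_shot_predict(image_emb: Vector, text_embs: Sequence[Vector]) -> int:
    """Return the index of the class prompt most similar to the image."""
    scores = [cosine(image_emb, t) for t in text_embs]
    return max(range(len(scores)), key=scores.__getitem__)


def accuracy(image_embs: Sequence[Vector], labels: Sequence[int],
             text_embs: Sequence[Vector]) -> float:
    """Fraction of images whose zero-shot prediction matches the label."""
    correct = sum(
        zero_shot_predict(emb, text_embs) == label
        for emb, label in zip(image_embs, labels)
    )
    return correct / len(labels)


def relative_robustness(clean_acc: float, corrupted_acc: float) -> float:
    """Illustrative score (an assumption): accuracy retained under corruption."""
    return corrupted_acc / clean_acc if clean_acc else 0.0


# Toy 2-D embeddings standing in for real model outputs (hypothetical data).
text_embs = [[1.0, 0.0], [0.0, 1.0]]          # one prompt embedding per class
labels = [0, 1, 0, 1]
clean = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.1, 0.9]]
# The second image embedding drifts toward class 0 after corruption.
corrupted = [[0.9, 0.1], [0.6, 0.4], [0.7, 0.3], [0.1, 0.9]]

clean_acc = accuracy(clean, labels, text_embs)
corrupted_acc = accuracy(corrupted, labels, text_embs)
score = relative_robustness(clean_acc, corrupted_acc)
print(f"clean={clean_acc:.2f} corrupted={corrupted_acc:.2f} retained={score:.2f}")
```

In a real evaluation the embeddings would come from the vision and text encoders of the model under test, and the corrupted images from the domain-specific corruption pipeline; a ratio like this is only one of several ways to summarize the drop.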

Taxonomy

Topics: Multimodal Machine Learning Applications · Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications