Do Vision-Language Foundational models show Robust Visual Perception?
Shivam Chandhok, Pranav Tandon

TL;DR
This paper investigates whether vision-language foundational models maintain robustness under various real-world distribution shifts like noise and weather effects, comparing their generalization capabilities to human perception.
Contribution
It provides a comprehensive analysis of the robustness of diverse vision-language models under multiple distribution shifts, highlighting their strengths and limitations.
Findings
Models show varying robustness to different corruptions.
Performance degrades significantly under severe shifts.
The analysis offers insights into the generalization capabilities of vision-language models.
Abstract
Recent advances in vision-language foundational models have enabled the development of systems that can perform visual understanding and reasoning tasks. However, it is unclear whether these models are robust to distribution shifts, and how their performance and generalization capabilities vary under changes in data distribution. In this project, we strive to answer the question "Are vision-language foundational models robust to distribution shifts like human perception?" Specifically, we consider a diverse range of vision-language models and compare how the performance of these systems is affected by corruption-based distribution shifts (such as motion blur, fog, snow, and Gaussian noise) commonly found in practical real-world scenarios. We analyse the generalization capabilities qualitatively and quantitatively on the zero-shot image classification task under the aforementioned distribution…
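The kind of evaluation the abstract describes can be illustrated with a minimal sketch: corrupt each image at increasing severity and record how classification accuracy changes. This is not the authors' code. The severity-to-noise mapping `SEVERITY_SIGMA`, the helper names, and the `classify` callable (a stand-in for any zero-shot classifier) are all assumptions made for illustration, and only Gaussian noise is shown.

```python
import random

# Hypothetical noise standard deviations for severity levels 1..5.
# These values are illustrative assumptions, not taken from the paper.
SEVERITY_SIGMA = [0.04, 0.08, 0.12, 0.18, 0.26]


def add_gaussian_noise(pixels, severity, rng):
    """Return a noisy copy of `pixels` (floats in [0, 1]), clipped to [0, 1]."""
    if not 1 <= severity <= len(SEVERITY_SIGMA):
        raise ValueError("severity must be between 1 and 5")
    sigma = SEVERITY_SIGMA[severity - 1]
    return [min(1.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in pixels]


def accuracy(classify, images, labels):
    """Fraction of images for which `classify(image)` equals the label."""
    correct = sum(1 for img, y in zip(images, labels) if classify(img) == y)
    return correct / len(labels)


def robustness_report(classify, images, labels, seed=0):
    """Map severity (0 = clean) to accuracy under Gaussian-noise corruption."""
    rng = random.Random(seed)  # fixed seed so the corruption is reproducible
    report = {0: accuracy(classify, images, labels)}
    for severity in range(1, len(SEVERITY_SIGMA) + 1):
        corrupted = [add_gaussian_noise(img, severity, rng) for img in images]
        report[severity] = accuracy(classify, corrupted, labels)
    return report


if __name__ == "__main__":
    # Toy stand-in classifier: labels an image by its mean brightness.
    def toy_classify(pixels):
        return "bright" if sum(pixels) / len(pixels) > 0.5 else "dark"

    images = [[0.9] * 64, [0.1] * 64]
    labels = ["bright", "dark"]
    for severity, acc in robustness_report(toy_classify, images, labels).items():
        print(f"severity {severity}: accuracy {acc:.2f}")
```

Comparing each severity's accuracy against the clean (severity 0) entry gives a simple measure of degradation; a real study would replace `toy_classify` with a vision-language model and add the other corruption types.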
Taxonomy
Topics: Visual Attention and Saliency Detection · Constraint Satisfaction and Optimization · Categorization, perception, and language
