Semantic Misalignment in Vision-Language Models under Perceptual Degradation

Guo Cheng

arXiv:2601.08355 · cs.CV · January 16, 2026


TL;DR

This paper investigates how vision-language models fail under perceptual degradation, revealing that small visual corruptions can cause significant semantic misalignments affecting safety-critical decisions.

Contribution

The paper introduces perception-realistic corruptions and language-level metrics for systematically evaluating semantic misalignment in VLMs under perception degradation.

Findings

01

VLMs exhibit severe semantic failures under only moderate drops in conventional segmentation metrics

02

Current robustness metrics do not predict semantic misalignments

03

The results highlight a disconnect between perception quality and semantic reliability

Abstract

Vision-Language Models (VLMs) are increasingly deployed in autonomous driving and embodied AI systems, where reliable perception is critical for safe semantic reasoning and decision-making. While recent VLMs demonstrate strong performance on multimodal benchmarks, their robustness to realistic perception degradation remains poorly understood. In this work, we systematically study semantic misalignment in VLMs under controlled degradation of upstream visual perception, using semantic segmentation on the Cityscapes dataset as a representative perception module. We introduce perception-realistic corruptions that induce only moderate drops in conventional segmentation metrics, yet observe severe failures in downstream VLM behavior, including hallucinated object mentions, omission of safety-critical entities, and inconsistent safety judgments. To quantify these effects, we propose a set of…
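
The failure modes named in the abstract (hallucinated object mentions and omission of safety-critical entities) can be made concrete with a small set-based comparison. The following is a minimal sketch, not the paper's metric definitions, which are cut off in the abstract above. It assumes that the object classes mentioned in a VLM response have already been extracted as label strings, and that a ground-truth set of classes present in the scene is available. The function name `mention_errors` and the choice of safety-critical classes are hypothetical; the class names themselves are standard Cityscapes labels.

```python
from typing import Dict, Iterable

# Hypothetical choice of safety-critical classes (an assumption for this
# sketch, not taken from the paper). The names are Cityscapes labels.
SAFETY_CRITICAL = frozenset({"person", "rider", "bicycle", "motorcycle"})


def mention_errors(
    mentioned: Iterable[str],
    present: Iterable[str],
    safety_critical: Iterable[str] = SAFETY_CRITICAL,
) -> Dict[str, object]:
    """Compare classes a VLM mentioned against classes actually in the scene."""
    mentioned_set = set(mentioned)
    present_set = set(present)
    critical_present = present_set & set(safety_critical)

    # Hallucinated: mentioned by the model but absent from the scene.
    hallucinated = mentioned_set - present_set
    # Omitted: safety-critical classes in the scene that were never mentioned.
    omitted_critical = critical_present - mentioned_set

    return {
        "hallucinated": hallucinated,
        "omitted_critical": omitted_critical,
        # Fraction of mentioned classes that are not in the scene.
        "hallucination_rate": (
            len(hallucinated) / len(mentioned_set) if mentioned_set else 0.0
        ),
        # Fraction of present safety-critical classes that went unmentioned.
        "critical_omission_rate": (
            len(omitted_critical) / len(critical_present)
            if critical_present
            else 0.0
        ),
    }


if __name__ == "__main__":
    # Toy example: the scene contains a rider the model never mentions,
    # and the model mentions a truck that is not there.
    result = mention_errors(
        mentioned=["car", "person", "truck"],
        present=["car", "person", "rider"],
    )
    print(result)
```

Computing these rates once with clean segmentation input and once with corrupted input, alongside a conventional segmentation score, would be one way to check whether a moderate drop in the segmentation score coincides with a larger change at the language level.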

Taxonomy

Topics: Multimodal Machine Learning Applications · Adversarial Robustness in Machine Learning · Ethics and Social Impacts of AI