A Semantic Decoupling-Based Two-Stage Rainy-Day Attack for Revealing Weather Robustness Deficiencies in Vision-Language Models

Chengyin Hu; Xiang Chen; Zhe Jia; Weiwen Shi; Fengyu Zhang; Jiujiang Guo; Yiwei Wei

arXiv:2601.13238 · cs.CV · January 21, 2026

Open Access

TL;DR

This paper introduces a novel two-stage adversarial framework that uses realistic rainy weather perturbations to reveal vulnerabilities in vision-language models, highlighting potential safety risks in real-world scenarios.

Contribution

The paper presents itself as the first to systematically analyze rain-induced semantic shifts in VLMs, using a physically grounded, two-stage adversarial approach based on semantic decoupling.

Findings

01 Rainy weather perturbations cause significant semantic misalignment in VLMs.

02 Illumination changes and multi-scale raindrop structures are key factors in semantic shifts.

03 The attack exposes potential safety and reliability risks for VLMs deployed in real-world applications.

Abstract

Vision-Language Models (VLMs) are trained on image-text pairs collected under canonical visual conditions and achieve strong performance on multimodal tasks. However, their robustness to real-world weather conditions, and the stability of cross-modal semantic alignment under such structured perturbations, remain insufficiently studied. In this paper, we focus on rainy scenarios and introduce the first adversarial framework that exploits realistic weather to attack VLMs, using a two-stage, parameterized perturbation model based on semantic decoupling to analyze rain-induced shifts in decision-making. In Stage 1, we model the global effects of rainfall by applying a low-dimensional global modulation to condition the embedding space and gradually weaken the original semantic decision boundaries. In Stage 2, we introduce structured rain variations by explicitly modeling multi-scale raindrop…
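The two stages described in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' perturbation model: the function names, the three global parameters (`gain`, `contrast`, `haze`), and the `(length, opacity, count)` streak tuples are all hypothetical, and the image is a plain nested list of grayscale values in [0, 1]. The adversarial search over these parameters against a VLM is not shown.

```python
import random

def global_modulation(img, gain=0.8, contrast=0.9, haze=0.1):
    """Stage 1 (assumed form): a low-dimensional global change.

    gain, contrast, and haze are hypothetical parameters standing in for
    the paper's global modulation; they are not taken from the source.
    """
    out = []
    for row in img:
        new_row = []
        for p in row:
            v = (p - 0.5) * contrast + 0.5     # scale contrast about mid-gray
            v = v * gain                       # global brightness change
            v = (1.0 - haze) * v + haze * 1.0  # blend toward a white veil
            new_row.append(min(1.0, max(0.0, v)))
        out.append(new_row)
    return out

def add_rain_streaks(img, scales=((2, 0.3, 20), (6, 0.5, 8)), seed=0):
    """Stage 2 (assumed form): vertical streaks at several scales.

    Each entry of scales is a hypothetical (length, opacity, count) tuple.
    """
    rng = random.Random(seed)
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]  # copy so the input is left unchanged
    for length, opacity, count in scales:
        for _ in range(count):
            y0, x0 = rng.randrange(h), rng.randrange(w)
            for k in range(length):
                y = y0 + k
                if y < h:
                    # alpha-blend a bright streak pixel over the image
                    out[y][x0] = (1.0 - opacity) * out[y][x0] + opacity
    return out

def rainy(img, stage1_params, stage2_scales, seed=0):
    """Apply the global stage first, then the structured streak stage."""
    return add_rain_streaks(global_modulation(img, **stage1_params),
                            stage2_scales, seed)
```

In an attack setting, one would presumably search over `stage1_params` and `stage2_scales` to reduce a model's image-text agreement; how the paper performs that search is not recoverable from the truncated abstract.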

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Multimodal Machine Learning Applications · Image Enhancement Techniques