CSR-Bench: A Benchmark for Evaluating the Cross-modal Safety and Reliability of MLLMs
Yuxuan Liu, Yuntian Shi, Kun Wang, Haoting Shen, Kun Yang

TL;DR
This paper introduces CSR-Bench, a benchmark for assessing the cross-modal safety and reliability of multimodal large language models (MLLMs) through four stress-testing interaction patterns that require joint image-text interpretation.
Contribution
The paper presents CSR-Bench, a benchmark covering 61 fine-grained types across Safety, Over-rejection, Bias, and Hallucination, with paired text-only controls. Evaluating 16 state-of-the-art MLLMs on it reveals systematic cross-modal alignment gaps.
Findings
Models exhibit weak safety awareness.
Models show strong language dominance under interference.
Performance degrades consistently from text-only controls to multimodal inputs.
Abstract
Multimodal large language models (MLLMs) enable interaction over both text and images, but their safety behavior can be driven by unimodal shortcuts instead of true joint intent understanding. We introduce CSR-Bench, a benchmark for evaluating cross-modal reliability through four stress-testing interaction patterns spanning Safety, Over-rejection, Bias, and Hallucination, covering 61 fine-grained types. Each instance is constructed to require integrated image-text interpretation, and we additionally provide paired text-only controls to diagnose modality-induced behavior shifts. We evaluate 16 state-of-the-art MLLMs and observe systematic cross-modal alignment gaps. Models show weak safety awareness, strong language dominance under interference, and consistent performance degradation from text-only controls to multimodal inputs. We also observe a clear trade-off between reducing…
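The paired text-only controls described in the abstract suggest a simple per-category comparison. The following is a minimal sketch of how such a modality gap might be computed, not the paper's actual metric or code: the `PairedResult` record, its field names, and the boolean "judged acceptable" outcome are all hypothetical and introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass
class PairedResult:
    """One benchmark instance with its text-only control (hypothetical schema)."""
    category: str        # e.g. "Safety", "Over-rejection", "Bias", "Hallucination"
    multimodal_ok: bool  # response to the image+text input was judged acceptable
    text_only_ok: bool   # response to the paired text-only control was judged acceptable


def modality_gap(results):
    """Return {category: (text_only_acc, multimodal_acc, gap)}.

    gap = text_only_acc - multimodal_acc; a positive value means the model
    does worse once the image is involved (assumed definition).
    """
    totals = {}
    for r in results:
        n, mm, to = totals.get(r.category, (0, 0, 0))
        totals[r.category] = (n + 1, mm + r.multimodal_ok, to + r.text_only_ok)
    return {
        cat: (to / n, mm / n, to / n - mm / n)
        for cat, (n, mm, to) in totals.items()
    }


if __name__ == "__main__":
    demo = [
        PairedResult("Safety", multimodal_ok=False, text_only_ok=True),
        PairedResult("Safety", multimodal_ok=True, text_only_ok=True),
        PairedResult("Bias", multimodal_ok=True, text_only_ok=True),
    ]
    for cat, (to_acc, mm_acc, gap) in modality_gap(demo).items():
        print(f"{cat}: text-only={to_acc:.2f} multimodal={mm_acc:.2f} gap={gap:+.2f}")
```

Reporting the gap per category rather than as a single aggregate keeps a drop in one dimension (for example Safety) from being masked by stable scores in another.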
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Natural Language Processing Techniques
