Safe Inputs but Unsafe Output: Benchmarking Cross-modality Safety Alignment of Large Vision-Language Model
Siyin Wang, Xingsong Ye, Qinyuan Cheng, Junwen Duan, Shimin Li, Jinlan Fu, Xipeng Qiu, Xuanjing Huang

TL;DR
This paper introduces the SIUO benchmark to evaluate cross-modality safety alignment in large vision-language models, revealing significant safety vulnerabilities and highlighting the need for improved safety measures in multi-modal AI systems.
Contribution
The paper presents a novel challenge and benchmark for cross-modality safety alignment, addressing a gap in existing safety evaluations for multi-modal AI models.
Findings
Substantial safety vulnerabilities found in current LVLMs
Current models struggle with complex, real-world safety scenarios
Benchmark covers 9 critical safety domains
Abstract
As Artificial General Intelligence (AGI) becomes increasingly integrated into various facets of human life, ensuring the safety and ethical alignment of such systems is paramount. Previous studies primarily focus on single-modality threats, which may not suffice given the integrated and complex nature of cross-modality interactions. We introduce a novel safety alignment challenge called Safe Inputs but Unsafe Output (SIUO) to evaluate cross-modality safety alignment. Specifically, it considers cases where single modalities are safe independently but could potentially lead to unsafe or unethical outputs when combined. To empirically investigate this problem, we developed the SIUO, a cross-modality benchmark encompassing 9 critical safety domains, such as self-harm, illegal activities, and privacy violations. Our findings reveal substantial safety vulnerabilities in both closed- and…
Taxonomy
Topics: Safety Systems Engineering in Autonomy · Risk and Safety Analysis · Software Reliability and Analysis Research
