UniSAFE: A Comprehensive Benchmark for Safety Evaluation of Unified Multimodal Models
Segyu Lee, Boryeong Cho, Hojung Jung, Seokhyun An, Juhyeong Kim, Jaehyun Kwak, Yongjin Yang, Sangwon Jang, Youngrok Park, Wonjun Chang, Se-Young Yun

TL;DR
UniSAFE provides a comprehensive benchmark for evaluating safety risks in unified multimodal models across various tasks and modalities, revealing critical vulnerabilities and emphasizing the need for improved safety alignment.
Contribution
This paper introduces UniSAFE, the first comprehensive benchmark for system-level safety evaluation of UMMs. It spans 7 I/O modality combinations and uses a shared-target design so that safety failures can be compared across tasks under controlled conditions.
Findings
Current UMMs show significant safety vulnerabilities.
Multi-image and multi-turn tasks are more prone to safety violations.
Image-output tasks are generally more vulnerable than text-output tasks.
Abstract
Unified Multimodal Models (UMMs) offer powerful cross-modality capabilities but introduce new safety risks not observed in single-task models. Despite their emergence, existing safety benchmarks remain fragmented across tasks and modalities, limiting the comprehensive evaluation of complex system-level vulnerabilities. To address this gap, we introduce UniSAFE, the first comprehensive benchmark for system-level safety evaluation of UMMs across 7 I/O modality combinations, spanning conventional tasks and novel multimodal-context image generation settings. UniSAFE is built with a shared-target design that projects common risk scenarios across task-specific I/O configurations, enabling controlled cross-task comparisons of safety failures. UniSAFE comprises 6,802 curated instances, and we use it to evaluate 15 state-of-the-art UMMs, both proprietary and open-source. Our results reveal critical…
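The shared-target design described in the abstract can be illustrated with a small sketch. The code below is not from the paper: the record layout, the target identifiers, the two task labels, and the judgments are all hypothetical placeholders, chosen only to show why projecting the same risk scenario onto several tasks makes per-task unsafe rates directly comparable.

```python
from collections import defaultdict

# Hypothetical records, NOT UniSAFE data: (target_id, task, judged_unsafe).
# Each target stands for one shared risk scenario that is projected
# onto more than one task configuration.
records = [
    ("target-1", "image-output", True),
    ("target-1", "text-output", False),
    ("target-2", "image-output", True),
    ("target-2", "text-output", True),
    ("target-3", "image-output", False),
    ("target-3", "text-output", False),
]


def unsafe_rate_by_task(records):
    """Return the fraction of unsafe outputs for each task."""
    counts = defaultdict(lambda: [0, 0])  # task -> [unsafe, total]
    for _target, task, unsafe in records:
        counts[task][0] += int(unsafe)
        counts[task][1] += 1
    return {task: unsafe / total for task, (unsafe, total) in counts.items()}


def divergent_targets(records, task_a, task_b):
    """Return targets judged unsafe under task_a but safe under task_b."""
    by_target = defaultdict(dict)
    for target, task, unsafe in records:
        by_target[target][task] = unsafe
    return sorted(
        target
        for target, result in by_target.items()
        if result.get(task_a) is True and result.get(task_b) is False
    )


rates = unsafe_rate_by_task(records)
divergent = divergent_targets(records, "image-output", "text-output")
```

Because every task sees the same set of targets, a gap between the two rates reflects the task configuration rather than a difference in prompt content, and `divergent_targets` lists the scenarios where a model fails in one configuration only.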
Taxonomy
Topics: Adversarial Robustness in Machine Learning · Safety Systems Engineering in Autonomy · Security and Verification in Computing
