UniSAFE: A Comprehensive Benchmark for Safety Evaluation of Unified Multimodal Models

Segyu Lee; Boryeong Cho; Hojung Jung; Seokhyun An; Juhyeong Kim; Jaehyun Kwak; Yongjin Yang; Sangwon Jang; Youngrok Park; Wonjun Chang; Se-Young Yun

arXiv:2603.17476·cs.CV·March 19, 2026

TL;DR

UniSAFE is a benchmark for evaluating safety risks in unified multimodal models (UMMs) across 7 I/O modality combinations. Evaluating 15 UMMs with it reveals critical vulnerabilities and underscores the need for stronger safety alignment.

Contribution

This paper introduces UniSAFE, the first comprehensive benchmark for system-level safety evaluation of UMMs. Its shared-target design projects common risk scenarios across task-specific I/O configurations, enabling controlled cross-task comparisons of safety failures.

Findings

01. Current UMMs show significant safety vulnerabilities.

02. Multi-image and multi-turn tasks are more prone to safety violations.

03. Image-output tasks are generally more vulnerable than text-output tasks.

Abstract

Unified Multimodal Models (UMMs) offer powerful cross-modality capabilities but introduce new safety risks not observed in single-task models. Despite their emergence, existing safety benchmarks remain fragmented across tasks and modalities, limiting the comprehensive evaluation of complex system-level vulnerabilities. To address this gap, we introduce UniSAFE, the first comprehensive benchmark for system-level safety evaluation of UMMs across 7 I/O modality combinations, spanning conventional tasks and novel multimodal-context image generation settings. UniSAFE is built with a shared-target design that projects common risk scenarios across task-specific I/O configurations, enabling controlled cross-task comparisons of safety failures. UniSAFE comprises 6,802 curated instances, and we use it to evaluate 15 state-of-the-art UMMs, both proprietary and open-source. Our results reveal critical…

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Safety Systems Engineering in Autonomy · Security and Verification in Computing