TL;DR
Lingua-SafetyBench is a benchmark of 100,440 harmful image-text pairs spanning 10 languages, designed to evaluate and improve the safety of Vision-Language Large Models (VLLMs) under joint multilingual and multimodal inputs.
Contribution
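It introduces a large-scale, multilingual, multimodal safety benchmark whose data is explicitly partitioned into image-dominant and text-dominant subsets, so that the sources of risk and the disparities across languages in current VLLMs can be analyzed separately.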
Findings
Current VLLMs show significant vulnerabilities under joint multilingual and multimodal inputs.
Risks are higher in Non-High-Resource Languages and non-Latin scripts.
Model scaling improves safety more for High-Resource Languages than for other languages, which widens the disparity between them.
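As a rough illustration of how disparities like those in the findings above could be quantified, the sketch below computes the share of unsafe responses per language group and per data subset, then takes the gap between groups. Everything in it is an assumption made for illustration: the record fields (`lang_group`, `subset`, `unsafe`), the helper `unsafe_rate_by`, and the toy records are hypothetical and are not the paper's data, metric, or evaluation code.

```python
from collections import defaultdict

# Hypothetical toy records, NOT results from the paper.
# Each record marks whether a model response was judged unsafe.
records = [
    {"lang_group": "HRL", "subset": "image-dominant", "unsafe": False},
    {"lang_group": "HRL", "subset": "image-dominant", "unsafe": True},
    {"lang_group": "HRL", "subset": "text-dominant", "unsafe": False},
    {"lang_group": "HRL", "subset": "text-dominant", "unsafe": False},
    {"lang_group": "Non-HRL", "subset": "image-dominant", "unsafe": True},
    {"lang_group": "Non-HRL", "subset": "image-dominant", "unsafe": True},
    {"lang_group": "Non-HRL", "subset": "text-dominant", "unsafe": True},
    {"lang_group": "Non-HRL", "subset": "text-dominant", "unsafe": False},
]


def unsafe_rate_by(rows, key):
    """Return the fraction of unsafe responses for each value of `key`."""
    totals = defaultdict(int)
    unsafe = defaultdict(int)
    for row in rows:
        totals[row[key]] += 1
        unsafe[row[key]] += int(row["unsafe"])
    return {group: unsafe[group] / totals[group] for group in totals}


# Unsafe-response rate per language group and per data subset.
group_rates = unsafe_rate_by(records, "lang_group")
subset_rates = unsafe_rate_by(records, "subset")

# Disparity between language groups (positive means Non-HRL is riskier).
gap = group_rates["Non-HRL"] - group_rates["HRL"]

print(group_rates, subset_rates, gap)
```

Computing the same gap for models of different sizes would show whether scaling narrows or widens it, which is the kind of comparison the third finding describes.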
Abstract
The robust safety of Vision-Language Large Models (VLLMs) against joint multilingual and multimodal threats remains severely underexplored. Current benchmarks typically isolate these dimensions, being either multilingual but text-only, or multimodal but monolingual. While recent red-teaming efforts attempt to bridge this gap by rendering harmful prompts as images, their overreliance on typography-style visuals and lack of semantically grounded image-text pairs fail to capture realistic cross-modal interactions under multilingual and multimodal conditions. To address this, we introduce Lingua-SafetyBench, a comprehensive benchmark of 100,440 harmful image-text pairs spanning 10 languages. Crucially, Lingua-SafetyBench explicitly partitions data into image-dominant and text-dominant subsets to precisely disentangle sources of risk. Extensive evaluations reveal that current VLLMs retain…
