VSCBench: Bridging the Gap in Vision-Language Model Safety Calibration

Jiahui Geng; Qing Li; Zongxiong Chen; Yuxia Wang; Derui Zhu; Zhuohan Xie; Chenyang Lyu; Xiuying Chen; Preslav Nakov; Fakhri Karray

arXiv:2505.20362 · cs.IR · May 28, 2025

TL;DR

This paper introduces VSCBench, a comprehensive benchmark dataset for evaluating and improving safety calibration in vision-language models, addressing both undersafety and oversafety issues.

Contribution

It presents VSCBench, a new dataset with 3,600 image-text pairs for assessing safety calibration, and evaluates existing models and methods, highlighting challenges and trade-offs.

Findings

01. Existing models exhibit significant undersafety and oversafety issues.

02. Some calibration methods improve safety but reduce model utility.

03. The benchmark enables systematic evaluation of safety calibration approaches.

Abstract

The rapid advancement of vision-language models (VLMs) has brought a lot of attention to their safety alignment. However, existing methods have primarily focused on model undersafety, where the model responds to hazardous queries, while neglecting oversafety, where the model refuses to answer safe queries. In this paper, we introduce the concept of safety calibration, which systematically addresses both undersafety and oversafety. Specifically, we present VSCBench, a novel dataset of 3,600 image-text pairs that are visually or textually similar but differ in terms of safety, which is designed to evaluate safety calibration across image-centric and text-centric scenarios. Based on our benchmark, we evaluate safety calibration across eleven widely used VLMs. Our extensive experiments revealed major issues with both undersafety and oversafety. We further investigated…
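To make the two failure modes in the abstract concrete, here is a minimal sketch of how undersafety and oversafety rates could be tallied from judged model responses. It is an illustration only, not the scoring protocol used by VSCBench: the `Judged` record, its fields, and the `calibration_rates` function are hypothetical names, and the sketch assumes each response has already been labeled as a refusal or an answer.

```python
from dataclasses import dataclass


@dataclass
class Judged:
    """One judged response (hypothetical record layout, not from the paper)."""
    is_safe_query: bool  # ground-truth safety label of the image-text pair
    refused: bool        # True if the model declined to answer


def calibration_rates(records):
    """Return (undersafety_rate, oversafety_rate) for a list of Judged records.

    Undersafety: fraction of unsafe queries the model answered.
    Oversafety: fraction of safe queries the model refused.
    A rate is 0.0 when its group is empty.
    """
    unsafe = [r for r in records if not r.is_safe_query]
    safe = [r for r in records if r.is_safe_query]
    undersafety = sum(not r.refused for r in unsafe) / len(unsafe) if unsafe else 0.0
    oversafety = sum(r.refused for r in safe) / len(safe) if safe else 0.0
    return undersafety, oversafety


# Toy data: 2 unsafe queries (1 answered), 4 safe queries (1 refused).
example = [
    Judged(is_safe_query=False, refused=True),
    Judged(is_safe_query=False, refused=False),
    Judged(is_safe_query=True, refused=False),
    Judged(is_safe_query=True, refused=False),
    Judged(is_safe_query=True, refused=True),
    Judged(is_safe_query=True, refused=False),
]
under, over = calibration_rates(example)
print(f"undersafety={under:.2f} oversafety={over:.2f}")
```

Reporting the two rates separately, rather than as one accuracy figure, keeps the trade-off visible: a model that refuses everything scores zero on undersafety but one on oversafety.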

Code & Models

Repositories

jiahuigeng/vscbench (official)

Taxonomy

Topics: Advanced Data Processing Techniques · Semantic Web and Ontologies · Fault Detection and Control Systems

Methods: Softmax · Attention Is All You Need