MOSSBench: Is Your Multimodal Language Model Oversensitive to Safe Queries?

Xirui Li; Hengguang Zhou; Ruochen Wang; Tianyi Zhou; Minhao Cheng; Cho-Jui Hsieh

arXiv:2406.17806·cs.CL·June 27, 2024

TL;DR

This paper reveals that advanced multimodal language models often reject harmless queries due to oversensitivity triggered by specific visual stimuli, highlighting a need for improved safety mechanisms.

Contribution

The paper introduces MOSSBench, a benchmark of 300 benign multimodal queries, and uses it to systematically evaluate oversensitivity in 20 state-of-the-art MLLMs. The evaluation shows that overcautious refusals are prevalent.

Findings

01

Oversensitivity is common, with refusal rates of up to 76% on harmless queries.

02

Safer models tend to be more oversensitive.

03

Different stimuli cause errors at perception, reasoning, and safety judgment stages.
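
The refusal rates quoted above are fractions of benign queries that a model declines to answer. A minimal sketch of that arithmetic follows, broken down by the three stimulus types named in the abstract. The record layout, the function name `refusal_rates`, and the toy data are assumptions made for illustration only; they are not the paper's evaluation code, and how a response is judged to be a refusal is left outside the sketch.

```python
from collections import defaultdict

# Stimulus-type labels taken from the abstract; using them as plain string
# keys is an assumption of this sketch, not the benchmark's actual schema.
STIMULUS_TYPES = (
    "Exaggerated Risk",
    "Negated Harm",
    "Counterintuitive Interpretation",
)


def refusal_rates(records):
    """Compute per-type and overall refusal rates.

    `records` is an iterable of (stimulus_type, refused) pairs, where
    `refused` is True if the model declined a benign query.
    """
    totals = defaultdict(int)
    refusals = defaultdict(int)
    for stimulus_type, refused in records:
        totals[stimulus_type] += 1
        refusals[stimulus_type] += int(refused)

    # Per-type rate: refused queries divided by all queries of that type.
    rates = {t: refusals[t] / totals[t] for t in totals}

    # Overall rate across every query; 0.0 when there are no records.
    n = sum(totals.values())
    rates["overall"] = sum(refusals.values()) / n if n else 0.0
    return rates


# Toy, made-up outcomes purely to show the call shape (not results from the paper).
toy_records = [
    ("Exaggerated Risk", True),
    ("Exaggerated Risk", False),
    ("Negated Harm", False),
    ("Counterintuitive Interpretation", True),
]
toy_rates = refusal_rates(toy_records)
```

Because every query in the benchmark is benign, any refusal counts as an oversensitivity error, so a lower rate is better on this metric.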

Abstract

Humans are prone to cognitive distortions -- biased thinking patterns that lead to exaggerated responses to specific stimuli, albeit in very different contexts. This paper demonstrates that advanced Multimodal Large Language Models (MLLMs) exhibit similar tendencies. While these models are designed to respond to queries under safety mechanisms, they sometimes reject harmless queries in the presence of certain visual stimuli, disregarding the benign nature of their contexts. As the initial step in investigating this behavior, we identify three types of stimuli that trigger the oversensitivity of existing MLLMs: Exaggerated Risk, Negated Harm, and Counterintuitive Interpretation. To systematically evaluate MLLMs' oversensitivity to these stimuli, we propose the Multimodal OverSenSitivity Benchmark (MOSSBench). This toolkit consists of 300 manually collected benign multimodal queries,…

Code & Models

Datasets

AIcell/MOSSBench
dataset · 323 dl

Taxonomy

Topics: Natural Language Processing Techniques · Semantic Web and Ontologies · Topic Modeling