MLLMGuard: A Multi-dimensional Safety Evaluation Suite for Multimodal Large Language Models

Tianle Gu; Zeyang Zhou; Kexin Huang; Dandan Liang; Yixu Wang; Haiquan Zhao; Yuanqi Yao; Xingge Qiao; Keqing Wang; Yujiu Yang; Yan Teng; Yu Qiao; Yingchun Wang

arXiv:2406.07594 · cs.CL · June 18, 2024

TL;DR

MLLMGuard is a comprehensive safety evaluation suite for Multimodal Large Language Models, covering multiple safety dimensions and languages, with a new dataset and an automated evaluator to improve safety assessment accuracy.

Contribution

The paper introduces MLLMGuard, a multidimensional safety evaluation framework with a bilingual dataset and a lightweight evaluator, addressing gaps in existing safety benchmarks for MLLMs.

Findings

01. MLLMs still have significant safety challenges.

02. MLLMGuard's evaluator outperforms GPT-4V in accuracy.

03. Evaluation across 13 models reveals safety issues.

Abstract

Powered by remarkable advancements in Large Language Models (LLMs), Multimodal Large Language Models (MLLMs) demonstrate impressive capabilities in manifold tasks. However, the practical application scenarios of MLLMs are intricate, exposing them to potential malicious instructions and thereby posing safety risks. While current benchmarks do incorporate certain safety considerations, they often lack comprehensive coverage and fail to exhibit the necessary rigor and robustness. For instance, the common practice of employing GPT-4V as both the evaluator and a model to be evaluated lacks credibility, as it tends to exhibit a bias toward its own responses. In this paper, we present MLLMGuard, a multidimensional safety evaluation suite for MLLMs, including a bilingual image-text evaluation dataset, inference utilities, and a lightweight evaluator. MLLMGuard's assessment comprehensively…

Code & Models

Repositories

Carol-gutianle/MLLMGuard · PyTorch · Official

Videos

MLLMGuard: A Multi-dimensional Safety Evaluation Suite for Multimodal Large Language Models · SlidesLive

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling

Methods: Residual Connection · Softmax · Layer Normalization · Byte Pair Encoding · Label Smoothing · Adam · Attention Is All You Need · Linear Layer · Multi-Head Attention · Position-Wise Feed-Forward Layer