MTMCS-Bench: Evaluating Contextual Safety of Multimodal Large Language Models in Multi-Turn Dialogues

Zheyuan Liu; Dongwhi Kim; Yixin Wan; Xiangchi Yuan; Zhaoxuan Tan; Fengran Mo; Meng Jiang

arXiv:2601.06757 · cs.CL · January 13, 2026

Open Access

TL;DR

This paper introduces MTMCS-Bench, a benchmark for assessing the contextual safety of multimodal large language models in multi-turn dialogues. It targets two kinds of risk: malicious intent that emerges gradually across turns (escalation-based risk) and context-switch risk.

Contribution

The paper presents MTMCS-Bench, a new benchmark with over 30,000 samples for evaluating safety in multimodal models across realistic multi-turn interactions and risk scenarios.

Findings

01. Models show trade-offs between safety and utility.

02. Guardrails mitigate some risks but are not fully effective.

03. Persistent safety challenges remain in multi-turn multimodal dialogues.

Abstract

Multimodal large language models (MLLMs) are increasingly deployed as assistants that interact through text and images, making it crucial to evaluate contextual safety when risk depends on both the visual scene and the evolving dialogue. Existing contextual safety benchmarks are mostly single-turn and often miss how malicious intent can emerge gradually or how the same scene can support both benign and exploitative goals. We introduce the Multi-Turn Multimodal Contextual Safety Benchmark (MTMCS-Bench), a benchmark of realistic images and multi-turn conversations that evaluates contextual safety in MLLMs under two complementary settings, escalation-based risk and context-switch risk. MTMCS-Bench offers paired safe and unsafe dialogues with structured evaluation. It contains over 30 thousand multimodal (image+text) and unimodal (text-only) samples, with metrics that separately measure…
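
To make the paired structure described in the abstract concrete, here is a minimal sketch of how one benchmark sample might be represented. It is not the authors' data format: the class names (`RiskSetting`, `Turn`, `PairedSample`), the field names, and the placeholder file name and dialogue text are all hypothetical. The sketch also assumes that a text-only sample simply has no image attached, and it does not attempt to reproduce the benchmark's metrics.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class RiskSetting(Enum):
    """The two settings named in the abstract (identifiers are hypothetical)."""
    ESCALATION = "escalation"
    CONTEXT_SWITCH = "context_switch"


@dataclass
class Turn:
    role: str  # "user" or "assistant"
    text: str


@dataclass
class PairedSample:
    """One scene with a safe and an unsafe multi-turn dialogue (assumed layout)."""
    sample_id: str
    setting: RiskSetting
    image_path: Optional[str]  # assumption: None marks a text-only sample
    safe_dialogue: List[Turn]
    unsafe_dialogue: List[Turn]

    @property
    def is_multimodal(self) -> bool:
        return self.image_path is not None


def count_by_setting(samples: List[PairedSample]) -> Counter:
    """Count how many samples fall under each risk setting."""
    return Counter(sample.setting for sample in samples)


# Placeholder data for illustration only; not taken from the benchmark.
samples = [
    PairedSample(
        sample_id="demo-1",
        setting=RiskSetting.ESCALATION,
        image_path="scene_001.jpg",
        safe_dialogue=[Turn("user", "placeholder benign question")],
        unsafe_dialogue=[Turn("user", "placeholder escalating question")],
    ),
    PairedSample(
        sample_id="demo-2",
        setting=RiskSetting.CONTEXT_SWITCH,
        image_path=None,
        safe_dialogue=[Turn("user", "placeholder benign question")],
        unsafe_dialogue=[Turn("user", "placeholder exploitative question")],
    ),
]

counts = count_by_setting(samples)
```

Keeping the safe and unsafe dialogues in the same record means a model's behavior on both can be compared for the same scene, which is one plausible way to read "paired" here.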

Taxonomy

Topics: Multimodal Machine Learning Applications · Topic Modeling · Adversarial Robustness in Machine Learning