Can MLLMs Read the Room? A Multimodal Benchmark for Assessing Deception in Multi-Party Social Interactions

Caixin Kang; Yifei Huang; Liangyang Ouyang; Mingfang Zhang; Ruicong Liu; Yoichi Sato

arXiv:2511.16221·cs.CV·November 21, 2025

Open Access

TL;DR

This paper introduces a new benchmark and dataset to evaluate multimodal large language models' ability to read social cues and assess deception, revealing current models' limitations and proposing new reasoning modules to improve social understanding.

Contribution

The paper presents the Multimodal Interactive Deception Assessment (MIDA) task, a new multimodal dataset with benchmark results for 12 MLLMs, and a Social Chain-of-Thought reasoning pipeline with a Dynamic Social Epistemic Memory module to enhance social reasoning in MLLMs.

Findings

01. Current MLLMs struggle to distinguish truth from falsehood in social interactions.

02. Models lack effective grounding in multimodal social cues.

03. Proposed modules improve performance on deception assessment.

Abstract

Despite their advanced reasoning capabilities, state-of-the-art Multimodal Large Language Models (MLLMs) demonstrably lack a core component of human intelligence: the ability to "read the room" and assess deception in complex social interactions. To rigorously quantify this failure, we introduce a new task, Multimodal Interactive Deception Assessment (MIDA), and present a novel multimodal dataset providing synchronized video and text with verifiable ground-truth labels for every statement. We establish a comprehensive benchmark evaluating 12 state-of-the-art open- and closed-source MLLMs, revealing a significant performance gap: even powerful models like GPT-4o struggle to distinguish truth from falsehood reliably. Our analysis of failure modes indicates that these models fail to effectively ground language in multimodal social cues and lack the ability to model what others know,…

Taxonomy

Topics: Deception detection and forensic psychology · Topic Modeling · Explainable Artificial Intelligence (XAI)