Can MLLMs Read the Room? A Multimodal Benchmark for Verifying Truthfulness in Multi-Party Social Interactions
Caixin Kang, Yifei Huang, Liangyang Ouyang, Mingfang Zhang, Yoichi Sato

TL;DR
This paper introduces a new benchmark for evaluating multimodal large language models' ability to verify truthfulness in multi-party social interactions, revealing current models' limitations in reading visual social cues and detecting deception.
Contribution
The paper presents a novel multimodal dataset and task for assessing MLLMs' social intelligence in deception detection, highlighting significant performance gaps and failure modes.
Findings
State-of-the-art MLLMs perform poorly on truth verification.
Models struggle to ground language in visual social cues.
Current models are overly conservative and lack perceptiveness.
Abstract
As AI systems become increasingly integrated into human lives, endowing them with robust social intelligence has emerged as a critical frontier. A key aspect of this intelligence is discerning truth from deception, a ubiquitous element of human interaction that is conveyed through a complex interplay of verbal language and non-verbal visual cues. However, automatic deception detection in dynamic, multi-party conversations remains a significant challenge. The recent rise of powerful Multimodal Large Language Models (MLLMs), with their impressive abilities in visual and textual understanding, makes them natural candidates for this task. Yet their capabilities in this crucial domain are mostly unquantified. To address this gap, we introduce a new task, Multimodal Interactive Veracity Assessment (MIVA), and present a novel multimodal dataset derived from the social deduction game…
Taxonomy
Topics: Deception detection and forensic psychology · Multimodal Machine Learning Applications · Explainable Artificial Intelligence (XAI)
