Perception or Prejudice: Can MLLMs Go Beyond First Impressions of Personality?

Caixin Kang; Tianyu Yan; Sitong Gong; Mingfang Zhang; Liangyang Ouyang; Ruicong Liu; Bo Zheng; Huchuan Lu; Kaipeng Zhang; Yoichi Sato; Yifei Huang

arXiv:2605.22109·cs.AI·May 22, 2026

TL;DR

This paper introduces Grounded Personality Reasoning (GPR), a new task with an accompanying dataset and benchmark, to evaluate whether Multimodal Large Language Models genuinely perceive personality traits through behavioral evidence or merely rely on superficial cues.

Contribution

It formalizes GPR as a new task, releases the MM-OCEAN dataset with timestamped behavioral observations, and benchmarks 27 MLLMs to analyze their reasoning and grounding capabilities.

Findings

01

51% of correct ratings are not grounded in cues

02

Holistic-Grounding Rate ranges only from 0 to 33.5%

03

Significant Prejudice Gap in model reasoning

Abstract

Multimodal Large Language Models (MLLMs) are increasingly deployed in human-facing roles where personality perception is critical, yet existing benchmarks evaluate this capability solely on numerical Big Five score prediction, leaving open whether models truly perceive personality through behavioral understanding or merely prejudge through superficial pattern matching. We address this gap with three contributions. (i) A new task: we formalize Grounded Personality Reasoning (GPR), which requires MLLMs to anchor each Big Five rating in observable evidence through a chain of rating, reasoning, and grounding. (ii) A new dataset: we release MM-OCEAN (1,104 videos, 5,320 MCQs), produced by a multi-agent pipeline with human verification, with timestamped behavioral observations, evidence-grounded trait analyses, and seven categories of cue-grounding MCQs. (iii) Benchmark and analysis: we…
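As a rough illustration of the rating, reasoning, and grounding chain described above, the sketch below models one response per trait and computes the share of correct ratings that are not backed by correctly answered cue-grounding MCQs. The record layout, the 0-1 rating scale, the tolerance that counts a rating as "correct", and the all-MCQs-correct grounding criterion are assumptions made for illustration only; the paper's actual metric definitions are not given in this excerpt.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TraitResponse:
    """One model response for a single Big Five trait (hypothetical layout)."""
    trait: str
    predicted: float          # model's rating, assumed to lie on a 0-1 scale
    reference: float          # reference rating on the same assumed scale
    mcq_correct: List[bool]   # outcome of each cue-grounding MCQ for this trait


def rating_is_correct(r: TraitResponse, tol: float = 0.1) -> bool:
    # Assumption: a rating counts as correct if it is within `tol` of the reference.
    return abs(r.predicted - r.reference) <= tol


def is_grounded(r: TraitResponse) -> bool:
    # Assumption: grounded means every associated MCQ was answered correctly.
    return len(r.mcq_correct) > 0 and all(r.mcq_correct)


def ungrounded_share(responses: List[TraitResponse], tol: float = 0.1) -> float:
    """Fraction of correct ratings that are not grounded (0.0 if none are correct)."""
    correct = [r for r in responses if rating_is_correct(r, tol)]
    if not correct:
        return 0.0
    ungrounded = [r for r in correct if not is_grounded(r)]
    return len(ungrounded) / len(correct)


# Made-up toy data, not taken from MM-OCEAN.
responses = [
    TraitResponse("openness", 0.70, 0.72, [True, True]),       # correct and grounded
    TraitResponse("extraversion", 0.40, 0.45, [True, False]),  # correct but ungrounded
    TraitResponse("neuroticism", 0.90, 0.30, [True, True]),    # incorrect rating
]
print(ungrounded_share(responses))
```

Under these assumptions, a high value would indicate that a model often reaches the right score without identifying the supporting behavioral cues, which is the kind of gap the findings above point to.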

Code & Models

Repositories

kkkcx/MM-OCEAN
github

Datasets

anonymous-mm-ocean/MM-OCEAN
dataset· 1.5k dl
