Human-like object concept representations emerge naturally in multimodal large language models

Changde Du, Kaicheng Fu, Bincheng Wen, Yi Sun, Jie Peng, Wei Wei, Ying Gao, Shengpei Wang, Chuncheng Zhang, Jinpeng Li, Shuang Qiu, Le Chang, and Huiguang He

arXiv:2407.01067·cs.AI·June 12, 2025

TL;DR

This study shows that multimodal large language models develop human-like object concept representations that align with neural activity patterns, with implications for understanding perception and cognition and for developing artificial intelligence.

Contribution

The paper shows that LLMs and MLLMs can form stable, interpretable, human-like object representations from linguistic and multimodal data, linking AI models to human cognition.

Findings

01 Embeddings capture semantic clustering similar to human mental representations

02 Strong alignment between model embeddings and neural activity in key brain regions

03 Low-dimensional, interpretable object concept embeddings were derived from the models' triplet judgments

Abstract

Understanding how humans conceptualize and categorize natural objects offers critical insights into perception and cognition. With the advent of Large Language Models (LLMs), a key question arises: can these models develop human-like object representations from linguistic and multimodal data? In this study, we combined behavioral and neuroimaging analyses to explore the relationship between object concept representations in LLMs and human cognition. We collected 4.7 million triplet judgments from LLMs and Multimodal LLMs (MLLMs) to derive low-dimensional embeddings that capture the similarity structure of 1,854 natural objects. The resulting 66-dimensional embeddings were stable, predictive, and exhibited semantic clustering similar to human mental representations. Remarkably, the dimensions underlying these embeddings were interpretable, suggesting that LLMs and MLLMs develop…
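To make the triplet setup in the abstract concrete, here is a minimal sketch of how an embedding can predict an odd-one-out judgment. The abstract does not say how the embeddings are fitted or scored, so the dot-product similarity and softmax choice rule below are assumptions (a common choice for odd-one-out data), and the function name and toy vectors are hypothetical. The sketch also works out how small a share of all possible triplets 4.7 million judgments cover for 1,854 objects.

```python
import math
import numpy as np

def odd_one_out_probs(emb, i, j, k):
    """Return [P(odd = i), P(odd = j), P(odd = k)] for one triplet.

    Assumed choice rule (not taken from the abstract): similarity is the
    dot product, and the most similar pair is picked via a softmax.
    """
    s_ij = emb[i] @ emb[j]
    s_ik = emb[i] @ emb[k]
    s_jk = emb[j] @ emb[k]
    # The odd one out is the object left out of the chosen pair:
    # choosing pair (j, k) makes i odd, (i, k) makes j odd, (i, j) makes k odd.
    logits = np.array([s_jk, s_ik, s_ij])
    logits = logits - logits.max()  # subtract max for numerical stability
    p = np.exp(logits)
    return p / p.sum()

# Toy, made-up embedding: 3 objects x 2 non-negative dimensions.
toy_emb = np.array([
    [2.0, 0.0],  # object 0
    [1.8, 0.1],  # object 1, close to object 0
    [0.0, 2.0],  # object 2, unlike the other two
])
probs = odd_one_out_probs(toy_emb, 0, 1, 2)
predicted_odd = int(np.argmax(probs))  # index into (i, j, k)

# Number of distinct triplets among 1,854 objects, and the share that
# 4.7 million judgments would cover if every judgment were a distinct triplet.
n_possible = math.comb(1854, 3)
fraction_sampled = 4.7e6 / n_possible

print(predicted_odd, n_possible, f"{fraction_sampled:.2%}")
```

With 1,854 objects there are 1,060,412,604 possible triplets, so 4.7 million judgments cover well under 1% of them. That gap is why a low-dimensional embedding that generalizes to unseen triplets is useful.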
