Talking to the brain: Using Large Language Models as Proxies to Model Brain Semantic Representation
Xin Liu, Ziyue Zhang, Jingxin Nie

TL;DR
This paper uses multimodal large language models as proxies to extract semantic representations from naturalistic images. The representations predict fMRI-measured neural activity patterns and reveal a hierarchical semantic organization across cortical regions.
Contribution
The paper introduces a paradigm that uses multimodal LLMs to extract semantic information from naturalistic stimuli for studying brain organization, avoiding the manual annotation burden of traditional experiments.
Findings
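LLM-derived representations predict established neural activity patterns measured by fMRI (e.g., for faces and buildings).
The representations reveal a hierarchical semantic organization across cortical regions.
A brain semantic network built from these representations yields clusters that reflect functional and contextual associations.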
Abstract
Traditional psychological experiments utilizing naturalistic stimuli face challenges in manual annotation and ecological validity. To address this, we introduce a novel paradigm leveraging multimodal large language models (LLMs) as proxies to extract rich semantic information from naturalistic images through a Visual Question Answering (VQA) strategy for analyzing human visual semantic representation. LLM-derived representations successfully predict established neural activity patterns measured by fMRI (e.g., faces, buildings), validating the paradigm's feasibility and revealing hierarchical semantic organization across cortical regions. A brain semantic network constructed from LLM-derived representations identifies meaningful clusters reflecting functional and contextual associations. This innovative methodology offers a powerful solution for investigating brain semantic organization with…
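The idea of predicting neural activity from LLM-derived representations can be illustrated with a generic encoding model. The sketch below is not the authors' pipeline: it assumes that VQA answers are encoded as a binary image-by-question matrix, that voxel responses are available as an image-by-voxel matrix, and that a ridge regression maps one to the other. All data are synthetic, and the names `fit_ridge`, `voxelwise_correlation`, and the chosen sizes are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 200 images, 12 yes/no VQA questions, 50 voxels.
n_images, n_questions, n_voxels = 200, 12, 50

# Stand-in for LLM answers: 1 = "yes", 0 = "no" (synthetic, not real model output).
X = rng.integers(0, 2, size=(n_images, n_questions)).astype(float)

# Synthetic voxel responses generated from known weights plus noise (assumption).
true_W = rng.normal(size=(n_questions, n_voxels))
Y = X @ true_W + 0.5 * rng.normal(size=(n_images, n_voxels))


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)


def voxelwise_correlation(Y_true, Y_pred):
    """Pearson correlation between measured and predicted responses, per voxel."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    denom = np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0)
    return (yt * yp).sum(axis=0) / denom


# Fit on the first 150 images, evaluate on the remaining 50.
train, test = slice(0, 150), slice(150, 200)

# Center features and responses using training-set means only.
x_mean = X[train].mean(axis=0)
y_mean = Y[train].mean(axis=0)

W = fit_ridge(X[train] - x_mean, Y[train] - y_mean, alpha=1.0)
Y_pred = (X[test] - x_mean) @ W + y_mean

scores = voxelwise_correlation(Y[test], Y_pred)
print(f"median held-out correlation: {np.median(scores):.2f}")
```

In a real analysis, the feature matrix would come from the model's answers to the VQA prompts and the responses from preprocessed fMRI data. The regularization strength would normally be chosen by cross-validation rather than fixed at 1.0.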
Taxonomy
Topics: Face Recognition and Perception · Neurobiology of Language and Bilingualism · Multimodal Machine Learning Applications
