Revealing emergent human-like conceptual representations from language prediction
Ningyu Xu, Qi Zhang, Chao Du, Qiang Luo, Xipeng Qiu, Xuanjing Huang, Menghan Zhang

TL;DR
This paper demonstrates that large language models develop structured, human-like conceptual representations through language prediction, which align with human behavior and neural activity, despite lacking real-world grounding.
Contribution
It reveals that LLMs form shared, context-independent conceptual structures from language alone, advancing understanding of their internal representations and their relation to human cognition.
Findings
LLMs can flexibly derive concepts from linguistic descriptions, guided by contextual cues about other concepts.
Representations in LLMs converge toward a shared, context-independent structure.
Alignment with this structure predicts model performance across understanding and reasoning tasks, and the representations align with human neural and behavioral data.
Abstract
People acquire concepts through rich physical and social experiences and use them to understand and navigate the world. In contrast, large language models (LLMs), trained solely through next-token prediction on text, exhibit strikingly human-like behaviors. Are these models developing concepts akin to those of humans? If so, how are such concepts represented, organized, and related to behavior? Here, we address these questions by investigating the representations formed by LLMs during an in-context concept inference task. We found that LLMs can flexibly derive concepts from linguistic descriptions in relation to contextual cues about other concepts. The derived representations converge toward a shared, context-independent structure, and alignment with this structure reliably predicts model performance across various understanding and reasoning tasks. Moreover, the convergent…
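One common way to quantify whether two sets of representations share a structure is representational similarity analysis (RSA): compare the pairwise dissimilarities among concepts in each set. The sketch below is an illustration under stated assumptions, not the authors' method. The data are synthetic, and the names `dissimilarity_matrix`, `rsa_alignment`, `context_a`, and `context_b` are hypothetical.

```python
import numpy as np

def dissimilarity_matrix(reps):
    """Pairwise cosine dissimilarity between rows of reps (n_concepts x dim)."""
    normed = reps / np.linalg.norm(reps, axis=1, keepdims=True)
    return 1.0 - normed @ normed.T

def rsa_alignment(reps_a, reps_b):
    """Pearson correlation between the upper triangles of two dissimilarity matrices."""
    iu = np.triu_indices(reps_a.shape[0], k=1)
    da = dissimilarity_matrix(reps_a)[iu]
    db = dissimilarity_matrix(reps_b)[iu]
    return float(np.corrcoef(da, db)[0, 1])

# Synthetic stand-ins for concept representations (assumption: 20 concepts, 16 dims).
rng = np.random.default_rng(0)
base = rng.normal(size=(20, 16))                        # hypothetical shared structure
rotation, _ = np.linalg.qr(rng.normal(size=(16, 16)))   # random orthogonal map

# Two "contexts": the same structure, one rotated, each with small noise.
context_a = base + 0.05 * rng.normal(size=base.shape)
context_b = base @ rotation + 0.05 * rng.normal(size=base.shape)
unrelated = rng.normal(size=(20, 16))                   # no shared structure

aligned_score = rsa_alignment(context_a, context_b)
baseline_score = rsa_alignment(context_a, unrelated)
print(f"shared structure: {aligned_score:.3f}, unrelated: {baseline_score:.3f}")
```

Because an orthogonal rotation preserves cosine similarities, the two contexts score high even though their coordinates differ, while the unrelated set scores near zero. A rank correlation (Spearman) is often preferred in practice; Pearson is used here only to keep the sketch dependency-free.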
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
