Large language models converge toward human-like concept organization
Mathias Lykke Gammelgaard, Jonathan Gabel Christiansen, Anders Søgaard

TL;DR
Large language models organize concepts in ways similar to structured knowledge bases, and this organization improves with larger, more advanced models, suggesting they learn human-like inferential semantics from raw text.
Contribution
The paper demonstrates that large language models develop concept organization akin to knowledge bases, indicating they learn inferential semantics beyond mere pattern matching.
Findings
Larger models show more human-like concept organization.
Concept organization aligns with knowledge base structures.
The size trend holds across four families of language models and three knowledge graph embeddings.
Abstract
Large language models show human-like performance in knowledge extraction, reasoning and dialogue, but it remains controversial whether this performance is best explained by memorization and pattern matching, or whether it reflects human-like inferential semantics and world knowledge. Knowledge bases such as WikiData provide large-scale, high-quality representations of inferential semantics and world knowledge. We show that large language models learn to organize concepts in ways that are strikingly similar to how concepts are organized in such knowledge bases. Knowledge bases model collective, institutional knowledge, and large language models seem to induce such knowledge from raw text. We show that bigger and better models exhibit more human-like concept organization, across four families of language models and three knowledge graph embeddings.
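The abstract does not name the measure used to compare concept organization in a language model with that in a knowledge graph embedding. As a minimal sketch, assuming a representational similarity analysis (one common choice, and not necessarily the authors' method), the comparison could look like the following. The function names `pairwise_cosine` and `rsa_score` and the toy vectors are hypothetical and only illustrate the idea.

```python
import numpy as np


def pairwise_cosine(X):
    """Return the n x n cosine-similarity matrix for the rows of X."""
    X = np.asarray(X, dtype=float)
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    Xn = X / np.clip(norms, 1e-12, None)  # guard against zero vectors
    return Xn @ Xn.T


def rsa_score(lm_vectors, kg_vectors):
    """Correlate the pairwise similarities of two embedding spaces.

    Row i of both inputs must describe the same concept. The two spaces
    may have different dimensionality, since only the concept-by-concept
    similarity structure is compared.
    """
    if len(lm_vectors) != len(kg_vectors):
        raise ValueError("both spaces must cover the same concepts")
    A = pairwise_cosine(lm_vectors)
    B = pairwise_cosine(kg_vectors)
    iu = np.triu_indices(A.shape[0], k=1)  # each concept pair once
    return float(np.corrcoef(A[iu], B[iu])[0, 1])


# Toy data standing in for real embeddings (assumption: 50 shared concepts).
rng = np.random.default_rng(0)
kg = rng.normal(size=(50, 16))

# A rotated copy keeps all cosine similarities, so it is organized identically.
rotation, _ = np.linalg.qr(rng.normal(size=(16, 16)))
lm_aligned = kg @ rotation

# An unrelated space has no shared organization.
lm_random = rng.normal(size=(50, 32))

print("aligned:", rsa_score(lm_aligned, kg))
print("random: ", rsa_score(lm_random, kg))
```

Under this assumption, a higher score for a larger model would correspond to the abstract's claim that bigger and better models organize concepts more like the knowledge base does.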
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Advanced Graph Neural Networks
