TL;DR
This study shows that language models trained on different languages, and the brains of native English, Chinese, and French speakers listening to the same story, converge on a shared conceptual space, suggesting that neural representations of meaning are shared across languages.
Contribution
It demonstrates that both language models and human neural responses converge on a shared conceptual space across different languages and speakers.
Findings
Language models trained on entirely different languages converge onto a similar embedding space, especially in the middle layers.
Encoding models trained on speakers of one language can predict neural responses to the same story in speakers of other languages.
Shared neural representations of meaning exist across speakers of different languages.
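One common way to quantify whether two models "converge onto a similar embedding space" is a representational similarity score computed over the same items. The sketch below uses linear centered kernel alignment (CKA) on synthetic data. Both the metric and the toy data are assumptions made for illustration; the abstract does not say which similarity measure the authors used, and the names `linear_cka`, `emb_a`, `emb_b`, and `emb_noise` are hypothetical.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA between two embedding matrices.

    X: (n_items, d1), Y: (n_items, d2). Rows must describe the same items
    (for example, aligned segments of one story in two languages).
    """
    # Center each feature so the score ignores constant offsets.
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)
    cross = np.linalg.norm(X.T @ Y, "fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, "fro")
    norm_y = np.linalg.norm(Y.T @ Y, "fro")
    return cross / (norm_x * norm_y)

rng = np.random.default_rng(0)

# Toy assumption: two "models" embed the same latent meaning factors
# through different random linear maps; a third matrix is pure noise.
shared = rng.standard_normal((200, 16))
emb_a = shared @ rng.standard_normal((16, 32))
emb_b = shared @ rng.standard_normal((16, 24))
emb_noise = rng.standard_normal((200, 24))

cka_shared = linear_cka(emb_a, emb_b)      # embeddings with common structure
cka_noise = linear_cka(emb_a, emb_noise)   # unrelated embeddings
cka_self = linear_cka(emb_a, emb_a)        # identical embeddings
```

In this toy setup the score for the two related embeddings should exceed the score against noise. With real models, one would compute such a score layer by layer to see where similarity peaks.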
Abstract
Human languages differ widely in their forms, each having distinct sounds, scripts, and syntax. Yet, they can all convey similar meaning. Do different languages converge on a shared neural substrate for conceptual meaning? We used language models (LMs) and naturalistic fMRI to identify neural representations of the shared conceptual meaning of the same story as heard by native speakers of three languages: English, Chinese, and French. We found that LMs trained on entirely different languages converge onto a similar embedding space, especially in the middle layers. We then aimed to find if a similar shared space exists in the brains of different native speakers of the three languages. We trained voxelwise encoding models that align the LM embeddings with neural responses from one group of subjects speaking a single language. We then used the encoding models trained on one language to…
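The voxelwise encoding approach described in the abstract can be sketched as a ridge regression from LM embeddings to per-voxel responses, fitted on one language group and evaluated on another. This is a minimal illustration on synthetic data, not the authors' pipeline: the ridge penalty, the train/test split, the toy data, and the names `fit_ridge`, `voxelwise_corr`, `emb_en`, `emb_fr`, `bold_en`, and `bold_fr` are all assumptions. A real fMRI analysis would also need hemodynamic delays and cross-validated regularization.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression; returns weights (n_features, n_voxels)."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)

def voxelwise_corr(Y_true, Y_pred):
    """Pearson correlation between true and predicted responses, per voxel."""
    yt = Y_true - Y_true.mean(axis=0)
    yp = Y_pred - Y_pred.mean(axis=0)
    denom = np.linalg.norm(yt, axis=0) * np.linalg.norm(yp, axis=0)
    return (yt * yp).sum(axis=0) / denom

rng = np.random.default_rng(0)
n_trs, n_features, n_voxels = 300, 20, 50

# Toy assumption: embeddings from two languages are noisy views of the
# same story features, and both groups share one feature-to-voxel mapping.
story = rng.standard_normal((n_trs, n_features))
emb_en = story + 0.1 * rng.standard_normal(story.shape)
emb_fr = story + 0.1 * rng.standard_normal(story.shape)
true_w = rng.standard_normal((n_features, n_voxels))
bold_en = story @ true_w + rng.standard_normal((n_trs, n_voxels))
bold_fr = story @ true_w + rng.standard_normal((n_trs, n_voxels))

# Fit on the first 200 time points of the "English" group only.
W_en = fit_ridge(emb_en[:200], bold_en[:200], alpha=10.0)

# Transfer: predict held-out "French" responses from French embeddings.
pred_fr = emb_fr[200:] @ W_en
scores = voxelwise_corr(bold_fr[200:], pred_fr)
mean_score = scores.mean()
```

High held-out correlations here follow directly from the shared mapping built into the toy data. On real recordings, the cross-language score is the quantity of interest, because it indicates how much of the learned embedding-to-brain mapping carries over between speaker groups.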
Taxonomy
Methods: ALIGN
