One Mind, Many Tongues: A Deep Dive into Language-Agnostic Knowledge Neurons in Large Language Models
Pengfei Cao, Yuheng Chen, Zhuoran Jin, Yubo Chen, Kang Liu, Jun Zhao

TL;DR
This paper introduces a new benchmark and method to accurately identify language-agnostic knowledge neurons in large language models across multiple languages, enhancing understanding of how factual knowledge is stored and utilized.
Contribution
It proposes RML-LAMA, a multilingual benchmark, and MATRICE, a method for uncertainty-aware localization of knowledge neurons in LLMs, addressing limitations of earlier work such as high uncertainty in localization results.
Findings
MATRICE accurately localizes language-agnostic knowledge neurons.
The study demonstrates the role of these neurons in cross-lingual tasks.
The results improve understanding of how knowledge is stored in multilingual LLMs.
Abstract
Large language models (LLMs) have learned vast amounts of factual knowledge through self-supervised pre-training on large-scale corpora. LLMs have also demonstrated excellent multilingual capabilities, allowing them to express the learned knowledge in multiple languages. However, the knowledge storage mechanism in LLMs remains mysterious. Some researchers have attempted to demystify factual knowledge in LLMs from the perspective of knowledge neurons, and have subsequently discovered language-agnostic knowledge neurons that store factual knowledge in a form that transcends language barriers. However, this preliminary finding suffers from two limitations: 1) High Uncertainty in Localization Results. The existing study uses only a prompt-based probe to localize knowledge neurons for each fact, yet LLMs cannot provide consistent answers to semantically equivalent queries. Thus, it leads to…
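The idea of localizing neurons under uncertainty and then keeping those shared across languages can be illustrated with a small sketch. This is not MATRICE and is not taken from the paper. It assumes per-neuron attribution scores have already been computed for several paraphrased queries in each language. The function name, the `threshold` and `k` parameters, and the lower-bound rule (mean minus `k` standard deviations) are all hypothetical choices made for illustration.

```python
import numpy as np

def language_agnostic_neurons(scores, threshold=0.5, k=1.0):
    """Toy selection of neurons that score high in every language.

    scores: array of shape (n_languages, n_paraphrases, n_neurons) holding
    hypothetical attribution scores. All names and the selection rule are
    illustrative assumptions, not the paper's method.
    """
    scores = np.asarray(scores, dtype=float)
    mean = scores.mean(axis=1)   # average over paraphrases, per language
    std = scores.std(axis=1)     # disagreement between paraphrases
    # Penalize neurons whose score varies across semantically equivalent queries.
    lower_bound = mean - k * std
    selected = lower_bound > threshold   # shape (n_languages, n_neurons)
    # Keep only neurons selected in every language.
    return np.flatnonzero(selected.all(axis=0))

# Toy data: 2 languages, 3 paraphrases, 4 neurons.
toy_scores = np.array([
    [[0.9, 0.9, 1.0, 0.1],
     [0.9, 0.9, 0.1, 0.1],
     [0.9, 0.9, 1.0, 0.1]],
    [[0.9, 0.1, 1.0, 0.1],
     [0.9, 0.1, 0.1, 0.1],
     [0.9, 0.1, 1.0, 0.1]],
])
print(language_agnostic_neurons(toy_scores))  # -> [0]
```

In the toy data, neuron 1 is strong in only one language and neuron 2 has a high average but inconsistent scores across paraphrases, so only neuron 0 survives both filters.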
Taxonomy
Topics: Natural Language Processing Techniques
Methods: Tanh Activation · Softmax · Low-Rank Factorization-based Multi-Head Attention
