One Mind, Many Tongues: A Deep Dive into Language-Agnostic Knowledge Neurons in Large Language Models

Pengfei Cao; Yuheng Chen; Zhuoran Jin; Yubo Chen; Kang Liu; Jun Zhao

arXiv:2411.17401 · cs.CL · November 27, 2024

Open Access

TL;DR

This paper introduces a new benchmark and method to accurately identify language-agnostic knowledge neurons in large language models across multiple languages, enhancing understanding of how factual knowledge is stored and utilized.

Contribution

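It proposes RML-LAMA, a multilingual benchmark, and MATRICE, a method for uncertainty-aware localization of knowledge neurons in LLMs. Both target limitations of earlier work, including the high uncertainty of localization results obtained from a single prompt-based probe.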

Findings

01

MATRICE accurately localizes language-agnostic knowledge neurons.

02

The study demonstrates the role of these neurons in cross-lingual tasks.

03

The analysis improves understanding of how knowledge is stored in multilingual LLMs.

Abstract

Large language models (LLMs) have learned vast amounts of factual knowledge through self-supervised pre-training on large-scale corpora. LLMs have also demonstrated excellent multilingual capabilities and can express the learned knowledge in multiple languages. However, the knowledge storage mechanism in LLMs remains mysterious. Some researchers attempt to demystify the factual knowledge in LLMs from the perspective of knowledge neurons, and have subsequently discovered language-agnostic knowledge neurons that store factual knowledge in a form that transcends language barriers. However, this preliminary finding suffers from two limitations: 1) High Uncertainty in Localization Results. The existing study uses only a prompt-based probe to localize knowledge neurons for each fact, while LLMs cannot provide consistent answers for semantically equivalent queries. Thus, it leads to…

Taxonomy

Topics: Natural Language Processing Techniques

Methods: Tanh Activation · Softmax · Low-Rank Factorization-based Multi-Head Attention