Layer-Targeted Multilingual Knowledge Erasure in Large Language Models

Taoran Li; Varun Chandrasekaran; Zhiyuan Yu

arXiv:2602.22562 · cs.CR · February 27, 2026

TL;DR

This paper investigates how to erase specific knowledge from large language models so that the erasure holds across languages. It identifies the intervention layer as the deciding factor and proposes a layer-targeted unlearning framework that removes the target knowledge while preserving multilingual capabilities.

Contribution

It introduces MUTE, a framework that uses layer-wise analysis to choose where to intervene, addressing the failure of previous unlearning methods to generalize across languages.

Findings

01 Intervention depth determines whether multilingual unlearning succeeds.

02 Shallow-layer interventions achieve erasure but collapse multilingual capabilities in held-out languages.

03 Deep-layer interventions preserve utility but fail to erase the target knowledge, even in source languages.
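
The findings above can be read as a rule for labelling the outcome of a layer-wise sweep. The following is a minimal sketch of that reading, not the paper's method: the `LayerResult` fields, the `classify` function, the two thresholds, and the toy numbers are all assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class LayerResult:
    """Hypothetical metrics recorded after unlearning at one layer."""
    layer: int
    forget_acc_source: float  # accuracy on the forget set in source languages (lower = better erasure)
    utility_heldout: float    # general capability score in held-out languages (higher = better)


def classify(result: LayerResult,
             max_forget_acc: float = 0.3,
             min_utility: float = 0.6) -> str:
    """Label one layer's outcome; both thresholds are illustrative assumptions."""
    erased = result.forget_acc_source <= max_forget_acc
    preserved = result.utility_heldout >= min_utility
    if erased and preserved:
        return "viable"
    if erased:
        # Failure mode described for shallow layers.
        return "erased_but_collapsed"
    if preserved:
        # Failure mode described for deep layers.
        return "preserved_but_not_erased"
    return "failed_both"


# Toy values invented for this example; they are NOT results from the paper.
sweep = [
    LayerResult(layer=2, forget_acc_source=0.05, utility_heldout=0.20),
    LayerResult(layer=12, forget_acc_source=0.10, utility_heldout=0.70),
    LayerResult(layer=28, forget_acc_source=0.80, utility_heldout=0.75),
]

labels = {r.layer: classify(r) for r in sweep}
for layer, label in labels.items():
    print(f"layer {layer}: {label}")
```

Treating erasure and utility as two separate pass/fail checks keeps the two failure modes distinguishable, which a single combined score would hide.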

Abstract

Recent work has demonstrated that machine unlearning in Large Language Models (LLMs) fails to generalize across languages: knowledge erased in one language frequently remains accessible through others. However, the underlying cause of this failure and a principled solution remain open. In this work, we identify intervention depth as the key factor determining multilingual generalization. Through systematic layer-wise experiments, we characterize two distinct failure modes: shallow-layer interventions achieve erasure but collapse multilingual capabilities in held-out languages, while deep-layer interventions preserve utility but fail to erase target knowledge even in source languages. These findings reveal that the choice of intervention layer is not a free parameter; it fundamentally determines whether multilingual unlearning succeeds. We propose MUTE (Multilingual Unlearning via…

Taxonomy

Topics: Topic Modeling · Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI)