Learn and Unlearn: Addressing Misinformation in Multilingual LLMs

Taiming Lu; Philipp Koehn

arXiv:2406.13748·cs.CL·September 4, 2025·1 cites

TL;DR

This paper examines how harmful misinformation propagates in multilingual LLMs and shows that effective unlearning requires addressing all languages involved to prevent the spread of harmful content.

Contribution

It highlights the limitations of standard unlearning methods, which typically focus on English data, in multilingual settings, and argues for comprehensive unlearning strategies that cover every language involved.

Findings

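01. Harmful information introduced in one language through training data spreads to other languages in multilingual LLMs.

02. Standard unlearning methods, which typically focus on English data, are insufficient for multilingual models and can inadvertently reinforce harmful content across languages.

03. Unlearning harmful responses in both English and the original language of the harmful data effectively removes harmful generations in all languages.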
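The third finding can be illustrated with a small data-preparation sketch. It is not the authors' code: the `HarmfulItem` type, the `build_unlearning_set` helper, and the translation lookup are all hypothetical names introduced here, and the unlearning step itself is left out. The sketch only shows the idea of pairing each harmful item in its original language with an English counterpart, so that whatever unlearning method is applied sees both.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HarmfulItem:
    """Hypothetical record for one piece of harmful training data."""
    prompt: str
    language: str  # language code of the data as it was introduced


def build_unlearning_set(items, english_versions, pivot="en"):
    """Return (language, prompt) pairs covering the original language and English.

    `english_versions` is an assumed lookup from an original prompt to its
    English translation; how translations are obtained is out of scope here.
    """
    targets = []
    for item in items:
        # Always target the item in the language it was introduced in.
        targets.append((item.language, item.prompt))
        if item.language != pivot:
            english = english_versions.get(item.prompt)
            if english is None:
                raise KeyError(f"no English version for: {item.prompt!r}")
            # Also target the English counterpart of the same item.
            targets.append((pivot, english))
    return targets
```

Under these assumptions, an item introduced in German contributes two targets (German and English), while an item introduced in English contributes one.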

Abstract

This paper investigates the propagation of harmful information in multilingual large language models (LLMs) and evaluates the efficacy of various unlearning methods. We demonstrate that fake information, regardless of the language it is in, once introduced into these models through training data, can spread across different languages, compromising the integrity and reliability of the generated content. Our findings reveal that standard unlearning techniques, which typically focus on English data, are insufficient in mitigating the spread of harmful content in multilingual contexts and could inadvertently reinforce harmful content across languages. We show that only by addressing harmful responses in both English and the original language of the harmful data can we effectively eliminate generations for all languages. This underscores the critical need for comprehensive unlearning…

Code & Models

Repositories

TaiMingLu/learn-unlearn
PyTorch · Official

Taxonomy

Topics: Natural Language Processing Techniques · Library Science and Information Systems

Methods: Focus