Understanding and Mitigating Cross-lingual Privacy Leakage via Language-specific and Universal Privacy Neurons

Wenshuo Dong; Qingsong Yang; Shu Yang; Lijie Hu; Meng Ding; Wanyu Lin; Tianhang Zheng; Di Wang

arXiv:2506.00759 · cs.CL · June 10, 2025

TL;DR

This paper investigates how privacy leakage occurs in multilingual large language models and proposes neuron-based methods to mitigate cross-lingual privacy risks, reducing leakage by up to 31.6%.

Contribution

The paper introduces two kinds of privacy neurons, privacy-universal and language-specific, and shows that deactivating them reduces privacy leakage across languages.

Findings

01. Privacy leakage peaks in language-specific model layers.

02. Deactivating identified privacy neurons reduces leakage by up to 31.6%.

03. Shared middle-layer representations contribute to cross-lingual privacy risks.
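
The findings mention deactivating neurons. As a rough illustration only, and not the authors' procedure, the sketch below zeroes the activations of chosen hidden units in a toy two-layer feed-forward block. The function name `mlp_forward`, the weight layout, and the neuron indices are hypothetical; how the paper actually identifies privacy neurons is not described in this summary.

```python
from typing import AbstractSet, List

Matrix = List[List[float]]


def relu(value: float) -> float:
    """Standard rectifier: negative inputs become zero."""
    return value if value > 0.0 else 0.0


def mlp_forward(
    x: List[float],
    w_in: Matrix,   # shape: hidden x input (hypothetical layout)
    w_out: Matrix,  # shape: output x hidden (hypothetical layout)
    deactivated: AbstractSet[int] = frozenset(),
) -> List[float]:
    """Run a toy feed-forward block, zeroing the listed hidden neurons."""
    # Hidden activations: ReLU of each row of w_in dotted with the input.
    hidden = [relu(sum(w * xi for w, xi in zip(row, x))) for row in w_in]
    # "Deactivate" a neuron by forcing its activation to zero.
    hidden = [0.0 if j in deactivated else h for j, h in enumerate(hidden)]
    # Project the (possibly masked) hidden vector to the output.
    return [sum(w * h for w, h in zip(row, hidden)) for row in w_out]


# Toy weights, chosen only to keep the arithmetic easy to follow.
x = [1.0, 2.0]
w_in = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_out = [[1.0, 1.0, 1.0], [1.0, 0.0, -1.0]]

baseline = mlp_forward(x, w_in, w_out)
masked = mlp_forward(x, w_in, w_out, deactivated={2})  # index 2 is arbitrary
```

Here the hidden vector is `[1.0, 2.0, 3.0]` before masking, so zeroing neuron 2 removes its contribution from both outputs. In a real model the same effect is usually obtained by masking activations inside the relevant layer at inference time, leaving the weights untouched.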

Abstract

Large Language Models (LLMs) trained on massive data capture rich information embedded in the training data. However, this also introduces the risk of privacy leakage, particularly involving personally identifiable information (PII). Although previous studies have shown that this risk can be mitigated through methods such as privacy neurons, they all assume that both the (sensitive) training data and user queries are in English. We show that they cannot defend against the privacy leakage in cross-lingual contexts: even if the training data is exclusively in one language, these (private) models may still reveal private information when queried in another language. In this work, we first investigate the information flow of cross-lingual privacy leakage to give a better understanding. We find that LLMs process private information in the middle layers, where representations are largely…

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning · Big Data and Digital Economy