IncogniText: Privacy-enhancing Conditional Text Anonymization via LLM-based Private Attribute Randomization

Ahmed Frikha; Nassim Walha; Krishna Kanth Nakka; Ricardo Mendes; Xue Jiang; Xuebing Zhou

arXiv:2407.02956 · cs.CR · February 4, 2025

Open Access

TL;DR

IncogniText is a text anonymization method that uses LLM-based private attribute randomization to reduce private attribute leakage by more than 90% across 8 attributes while preserving text utility. Its anonymization capability can also be distilled into LoRA parameters for an on-device model.

Contribution

We introduce IncogniText, a text anonymization technique that uses LLMs to mislead adversaries into predicting a wrong private attribute value while preserving text utility, and we distill its anonymization capability into LoRA parameters for an on-device model.

Findings

01 Over 90% reduction in private attribute leakage across 8 attributes

02 Effective on-device anonymization with limited utility loss

03 Demonstrated applicability in real-world scenarios

Abstract

In this work, we address the problem of text anonymization where the goal is to prevent adversaries from correctly inferring private attributes of the author, while keeping the text utility, i.e., meaning and semantics. We propose IncogniText, a technique that anonymizes the text to mislead a potential adversary into predicting a wrong private attribute value. Our empirical evaluation shows a reduction of private attribute leakage by more than 90% across 8 different private attributes. Finally, we demonstrate the maturity of IncogniText for real-world applications by distilling its anonymization capability into a set of LoRA parameters associated with an on-device model. Our results show the possibility of reducing privacy leakage by more than half with limited impact on utility.
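
The core idea in the abstract, rewriting a text so that an adversary is steered toward a wrong attribute value, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the attribute value lists, the function names, and the prompt wording are hypothetical and are not taken from the paper, and the call to an actual LLM is left out. The last helper shows one plausible way to express a "reduction of leakage" as a relative drop in adversary accuracy; the paper's exact metric may differ.

```python
import random

# Hypothetical value sets for two private attributes (illustration only,
# not the attribute taxonomy used in the paper).
ATTRIBUTE_VALUES = {
    "age": ["18-25", "26-35", "36-50", "51+"],
    "occupation": ["teacher", "nurse", "engineer", "chef"],
}


def sample_target_value(attribute, true_value, rng):
    """Pick a random decoy value that differs from the author's true value."""
    candidates = [v for v in ATTRIBUTE_VALUES[attribute] if v != true_value]
    if not candidates:
        raise ValueError(f"no alternative value available for {attribute!r}")
    return rng.choice(candidates)


def build_anonymization_prompt(text, attribute, target_value):
    """Assemble an instruction for an anonymizer LLM (wording is an assumption)."""
    return (
        "Rewrite the text below so that a reader would infer that the "
        f"author's {attribute} is '{target_value}'. Keep the meaning and "
        "semantics of the text unchanged.\n\n"
        f"Text: {text}"
    )


def relative_leakage_reduction(acc_before, acc_after):
    """Relative drop in adversary accuracy after anonymization."""
    if acc_before <= 0:
        raise ValueError("adversary accuracy before anonymization must be positive")
    return (acc_before - acc_after) / acc_before


rng = random.Random(0)  # fixed seed so the sketch is reproducible
target = sample_target_value("age", "26-35", rng)
prompt = build_anonymization_prompt(
    "I just finished my second year of teaching.", "age", target
)
```

In this sketch the prompt would be passed to whichever LLM performs the rewrite. Sampling the decoy value at random, rather than always choosing the same one, is what makes the misleading signal a randomization of the private attribute.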

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Privacy, Security, and Data Protection

Methods: Sparse Evolutionary Training