IncogniText: Privacy-enhancing Conditional Text Anonymization via LLM-based Private Attribute Randomization
Ahmed Frikha, Nassim Walha, Krishna Kanth Nakka, Ricardo Mendes, Xue Jiang, Xuebing Zhou

TL;DR
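IncogniText is a text anonymization method that uses an LLM to rewrite text so that an adversary is misled into inferring a wrong private attribute value. It reduces private attribute leakage by more than 90% across 8 attributes while preserving the text's meaning, and a distilled on-device variant still reduces leakage by more than half.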
Contribution
We introduce IncogniText, a privacy-preserving text anonymization technique leveraging LLMs to mislead adversaries and maintain utility, with practical on-device implementation.
Findings
Over 90% reduction in private attribute leakage across 8 attributes
On-device anonymization, distilled into LoRA parameters, reduces privacy leakage by more than half with limited utility loss
Demonstrated applicability in real-world scenarios
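To make the headline figure concrete, the following minimal sketch shows one way a relative leakage reduction could be computed. It assumes leakage is measured as the adversary's accuracy at inferring the private attribute before and after anonymization; the function name and the example numbers are hypothetical and are not taken from the paper.

```python
def relative_leakage_reduction(acc_before: float, acc_after: float) -> float:
    """Fraction by which adversary accuracy drops after anonymization.

    Assumption: leakage is the adversary's attribute-inference accuracy.
    """
    if not 0.0 < acc_before <= 1.0:
        raise ValueError("acc_before must be in (0, 1]")
    if not 0.0 <= acc_after <= 1.0:
        raise ValueError("acc_after must be in [0, 1]")
    return (acc_before - acc_after) / acc_before


# Illustrative numbers only: an adversary that was right 80% of the time
# on the original texts and 8% of the time on the anonymized texts.
reduction = relative_leakage_reduction(0.80, 0.08)
print(f"relative reduction: {reduction:.0%}")
```

Under this definition, a reduction above 90% means the adversary recovers the true attribute value less than one tenth as often as on the original text.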
Abstract
In this work, we address the problem of text anonymization where the goal is to prevent adversaries from correctly inferring private attributes of the author, while keeping the text utility, i.e., meaning and semantics. We propose IncogniText, a technique that anonymizes the text to mislead a potential adversary into predicting a wrong private attribute value. Our empirical evaluation shows a reduction of private attribute leakage by more than 90% across 8 different private attributes. Finally, we demonstrate the maturity of IncogniText for real-world applications by distilling its anonymization capability into a set of LoRA parameters associated with an on-device model. Our results show the possibility of reducing privacy leakage by more than half with limited impact on utility.
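The core idea in the abstract, steering the rewrite toward a wrong attribute value, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the attribute domains, the helper names `pick_target_value` and `build_anonymization_prompt`, and the prompt wording are all hypothetical, and the call to an actual LLM is left out.

```python
import random

# Hypothetical attribute domains; the values used in the paper are not
# listed in this summary.
ATTRIBUTE_VALUES = {
    "age": ["18-25", "26-35", "36-50", "51+"],
    "relationship_status": ["single", "married", "divorced"],
}


def pick_target_value(attribute: str, true_value: str, rng: random.Random) -> str:
    """Randomly pick a decoy value that differs from the author's true value."""
    candidates = [v for v in ATTRIBUTE_VALUES[attribute] if v != true_value]
    if not candidates:
        raise ValueError(f"no alternative value available for {attribute!r}")
    return rng.choice(candidates)


def build_anonymization_prompt(text: str, attribute: str, target_value: str) -> str:
    """Build an instruction asking an LLM to rewrite the text toward the decoy."""
    return (
        "Rewrite the text below so that a reader would infer that the author's "
        f"{attribute} is '{target_value}'. Keep the meaning and semantics of the "
        "text unchanged and alter only the cues that reveal this attribute.\n\n"
        f"Text: {text}"
    )


rng = random.Random(0)  # seeded for reproducibility of the sketch
target = pick_target_value("age", "26-35", rng)
prompt = build_anonymization_prompt(
    "Just finished my first year at my first real job.", "age", target
)
print(prompt)
```

Sampling the decoy at random, rather than always using a fixed replacement, means the rewritten text does not systematically point back to the true value. The resulting prompt would then be sent to an anonymizer model, which in the on-device setting described above would be a smaller model with LoRA parameters.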
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Privacy, Security, and Data Protection
Methods: Sparse Evolutionary Training
