On the Vulnerability of Text Sanitization
Meng Tong, Kejiang Chen, Xiaojian Yuan, Jiayang Liu, Weiming Zhang, Nenghai Yu, Jie Zhang

TL;DR
This paper critically evaluates the privacy protection effectiveness of text sanitization by developing theoretically optimal and practical reconstruction attacks, revealing significant vulnerabilities and prompting a reassessment of current sanitization methods.
Contribution
It introduces theoretically grounded and practical reconstruction attacks that outperform existing methods, providing a more accurate assessment of text sanitization privacy risks.
Findings
One of the attacks improved the attack success rate (ASR) by 46.4% over the baseline.
Revealed significant vulnerabilities in current text sanitization methods.
Provided bounds on attack success rate as benchmarks for evaluation.
Abstract
Text sanitization, which employs differential privacy to replace sensitive tokens with new ones, represents a significant technique for privacy protection. Typically, its performance in preserving privacy is evaluated by measuring the attack success rate (ASR) of reconstruction attacks, where attackers attempt to recover the original tokens from the sanitized ones. However, current reconstruction attacks on text sanitization are developed empirically, making it challenging to accurately assess the effectiveness of sanitization. In this paper, we aim to provide a more accurate evaluation of sanitization effectiveness. Inspired by the works of Palamidessi et al., we implement theoretically optimal reconstruction attacks targeting text sanitization. We derive their bounds on ASR as benchmarks for evaluating sanitization performance. For real-world applications, we propose two practical…
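To make the evaluation setup in the abstract concrete, here is a minimal sketch of a reconstruction attack and its ASR on a toy vocabulary. It is not the paper's implementation. It assumes a token-level exponential mechanism over a hypothetical 1-D embedding, a known prior over original tokens, and an attacker who guesses the most probable original token for each sanitized token (the Bayes-optimal guess under those assumptions). All names (`exponential_mechanism`, `bayes_optimal_attack`, `expected_asr`, the toy vocabulary and prior) are illustrative.

```python
import math

def exponential_mechanism(vocab, dist, epsilon):
    """Return the channel P[y | x] as a nested dict (assumed sanitizer:
    each token x is replaced by y with probability ∝ exp(-epsilon * d(x, y) / 2))."""
    channel = {}
    for x in vocab:
        weights = {y: math.exp(-epsilon * dist(x, y) / 2) for y in vocab}
        total = sum(weights.values())
        channel[x] = {y: w / total for y, w in weights.items()}
    return channel

def bayes_optimal_attack(channel, prior):
    """For each sanitized token y, guess the x that maximizes prior[x] * P[y | x]."""
    vocab = list(prior)
    return {y: max(vocab, key=lambda x: prior[x] * channel[x][y]) for y in vocab}

def expected_asr(channel, prior, guess):
    """Probability that the attacker's guess equals the original token."""
    return sum(
        prior[x] * channel[x][y]
        for x in prior
        for y in channel[x]
        if guess[y] == x
    )

# Toy data (hypothetical): four tokens placed on a line, with a skewed prior.
vocab = ["cat", "dog", "car", "bus"]
positions = {"cat": 0.0, "dog": 1.0, "car": 5.0, "bus": 6.0}
prior = {"cat": 0.4, "dog": 0.3, "car": 0.2, "bus": 0.1}

def dist(a, b):
    return abs(positions[a] - positions[b])

if __name__ == "__main__":
    for eps in (0.0, 1.0, 5.0):
        channel = exponential_mechanism(vocab, dist, eps)
        guess = bayes_optimal_attack(channel, prior)
        print(f"epsilon={eps}: expected ASR = {expected_asr(channel, prior, guess):.3f}")
```

In this toy setting, no attack that sees only the sanitized token can exceed the ASR computed this way, which is the sense in which such a value can serve as a benchmark. With `epsilon = 0` the sanitized token carries no information, so the attacker can only fall back on the most likely token under the prior.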
Taxonomy
Topics: Digital and Cyber Forensics
