How do we measure privacy in text? A survey of text anonymization metrics
Yaxuan Ren, Krithika Ramesh, Yaxing Yao, Anjalie Field

TL;DR
This survey reviews and compares metrics for evaluating privacy in text anonymization, assesses how well they align with legal standards and user expectations, and highlights gaps in current evaluation practices.
Contribution
The survey systematically categorizes privacy notions and metrics in text anonymization, offering guidance on choosing among them and identifying gaps that more robust privacy evaluation methods would need to address.
Findings
Identifies six distinct privacy notions in text anonymization
Analyzes alignment of metrics with HIPAA and GDPR standards
Highlights gaps and practical challenges in current privacy evaluation practices
Abstract
In this work, we aim to clarify and reconcile metrics for evaluating privacy protection in text through a systematic survey. Although text anonymization is essential for enabling NLP research and model development in domains with sensitive data, evaluating whether anonymization methods sufficiently protect privacy remains an open challenge. In manually reviewing 47 papers that report privacy metrics, we identify and compare six distinct privacy notions, and analyze how the associated metrics capture different aspects of privacy risk. We then assess how well these notions align with legal privacy standards (HIPAA and GDPR), as well as user-centered expectations grounded in HCI studies. Our analysis offers practical guidance on navigating the landscape of privacy evaluation approaches further and highlights gaps in current practices. Ultimately, we aim to facilitate more robust,…
Taxonomy
Topics: Privacy, Security, and Data Protection · Privacy-Preserving Technologies in Data · Authorship Attribution and Profiling
