Unintended Memorization and Timing Attacks in Named Entity Recognition Models
Rana Salal Ali, Benjamin Zi Hao Zhao, Hassan Jameel Asghar, Tham Nguyen, Ian David Wood, Dali Kaafar

TL;DR
This paper reveals that named entity recognition models are vulnerable to membership inference attacks through unintended memorization and timing side-channels, posing privacy risks in sensitive data redaction applications.
Contribution
The paper demonstrates two membership inference attacks on NER models, one memorization-based and one timing-based, and evaluates their effectiveness, highlighting privacy vulnerabilities in real-world applications.
Findings
70% AUC in memorization attack
99.23% AUC in timing attack
Memorization occurs even with a single phrase
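The AUC figures above measure how well an attack score separates training-set members from non-members. As a rough illustration only, the sketch below computes AUC as the probability that a randomly chosen member receives a higher attack score than a randomly chosen non-member, with ties counted as half. The function name `auc` and the score lists are assumptions made up for this example; they are not the paper's code or data.

```python
def auc(member_scores, non_member_scores):
    """Pairwise (rank-based) AUC: fraction of member/non-member pairs
    in which the member has the higher score; ties count as half."""
    wins = 0.0
    for m in member_scores:
        for n in non_member_scores:
            if m > n:
                wins += 1.0
            elif m == n:
                wins += 0.5
    return wins / (len(member_scores) * len(non_member_scores))


# Hypothetical attack scores, invented for illustration (not from the paper).
members = [0.9, 0.8, 0.4]
non_members = [0.7, 0.3, 0.2]
print(f"AUC = {auc(members, non_members):.3f}")
```

An AUC of 0.5 corresponds to random guessing and 1.0 to perfect separation, which is the scale on which the 70% and 99.23% figures should be read.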
Abstract
Named entity recognition (NER) models are widely used for identifying named entities (e.g., individuals, locations, and other information) in text documents. Machine learning based NER models are increasingly being applied in privacy-sensitive applications that need automatic and scalable identification of sensitive information to redact text for data sharing. In this paper, we study the setting when NER models are available as a black-box service for identifying sensitive information in user documents and show that these models are vulnerable to membership inference on their training datasets. With updated pre-trained NER models from spaCy, we demonstrate two distinct membership attacks on these models. Our first attack capitalizes on unintended memorization in the NER's underlying neural network, a phenomenon NNs are known to be vulnerable to. Our second attack leverages a timing…
Taxonomy
Topics: Topic Modeling · Privacy-Preserving Technologies in Data · Machine Learning in Healthcare
