When differential privacy meets NLP: The devil is in the detail

Ivan Habernal

arXiv:2109.03175·cs.CL·March 7, 2022

TL;DR

This paper critically analyzes ADePT, a differentially private text auto-encoder, and shows that it does not satisfy differential privacy despite its promising downstream results, underscoring the need for rigorous formal guarantees in NLP privacy applications.

Contribution

It provides a formal analysis showing ADePT's privacy guarantees are invalid, highlighting the importance of thorough scrutiny of privacy claims in NLP models.

Findings

01

ADePT is not truly differentially private.

02

The true sensitivity of ADePT's mechanism is at least six times higher than assumed, even in the optimistic case of a very small encoder dimension.

03

The share of utterances that are not privatized could easily reach 100% of the dataset.

Abstract

Differential privacy provides a formal approach to privacy of individuals. Applications of differential privacy in various scenarios, such as protecting users' original utterances, must satisfy certain mathematical properties. Our contribution is a formal analysis of ADePT, a differentially private auto-encoder for text rewriting (Krishna et al., 2021). ADePT achieves promising results on downstream tasks while providing tight privacy guarantees. Our proof reveals that ADePT is not differentially private, thus rendering the experimental results unsubstantiated. We also quantify the impact of the error in its private mechanism, showing that the true sensitivity is higher by at least a factor of 6 in an optimistic case of a very small encoder dimension and that the number of utterances that are not privatized could easily reach 100% of the entire dataset. Our intention is neither to criticize…
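
The abstract does not say how the sensitivity error arises, so the following is only a generic sketch of how a sensitivity gap of this kind can occur, not a reproduction of ADePT or of the paper's proof. It assumes a hypothetical mechanism that clips a vector to L2 norm C and then calibrates Laplace noise to a sensitivity of 2C. The Laplace mechanism needs the L1 sensitivity, and two vectors inside an L2 ball of radius C can be as far as 2C·√n apart in L1 distance. The function names, the dimension n = 64, and C = 1.0 are illustrative assumptions.

```python
import math


def clip_l2(vec, c):
    """Scale vec so that its L2 norm is at most c."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm <= c or norm == 0.0:
        return list(vec)
    return [x * c / norm for x in vec]


def l1_distance(a, b):
    """Sum of absolute coordinate-wise differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def worst_case_l1_gap(n, c):
    """L1 distance between two opposite, evenly spread vectors after L2 clipping."""
    u = clip_l2([1.0] * n, c)
    v = clip_l2([-1.0] * n, c)
    return l1_distance(u, v)


# Hypothetical values chosen for illustration only.
n, c = 64, 1.0
gap = worst_case_l1_gap(n, c)   # equals 2 * c * sqrt(n)
ratio = gap / (2 * c)           # how far the L1 sensitivity exceeds 2C

print(f"L1 gap: {gap}, ratio to 2C: {ratio}")  # L1 gap: 16.0, ratio to 2C: 8.0
```

Under these assumptions the ratio grows as √n, so noise calibrated to 2C would be too small by that factor even for modest dimensions.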
