CAPE: Context-Aware Private Embeddings for Private Language Learning

Richard Plant; Dimitra Gkatzia; Valerio Giuffrida

arXiv:2108.12318 · cs.CL · August 30, 2021

TL;DR

CAPE introduces a privacy-preserving method for training language embeddings by combining differential privacy and adversarial training to reduce private information leakage while maintaining semantic integrity.

Contribution

The paper presents CAPE, a novel approach that enhances privacy in language embeddings through combined differential privacy and adversarial training techniques.

Findings

01 CAPE effectively reduces private information leakage.

02 CAPE maintains the semantic quality of embeddings.

03 CAPE outperforms single-intervention privacy methods.

Abstract

Deep learning-based language models have achieved state-of-the-art results in a number of applications including sentiment analysis, topic labelling, intent classification and others. Obtaining text representations or embeddings using these models presents the possibility of encoding personally identifiable information learned from language and context cues that may present a risk to reputation or privacy. To ameliorate these issues, we propose Context-Aware Private Embeddings (CAPE), a novel approach which preserves privacy during training of embeddings. To maintain the privacy of text representations, CAPE applies calibrated noise through differential privacy, preserving the encoded semantic links while obscuring sensitive information. In addition, CAPE employs an adversarial training regime that obscures identified private variables. Experimental results demonstrate that the proposed…
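The "calibrated noise" step described in the abstract can be illustrated with a minimal sketch of the standard Laplace mechanism applied to a single embedding vector. This is not the authors' implementation: the function names `laplace_noise` and `privatize_embedding`, the per-vector min-max normalisation, and the default sensitivity are all assumptions made for illustration. The adversarial training regime is not covered here.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize_embedding(embedding, epsilon, sensitivity=None, seed=None):
    """Add Laplace noise to an embedding (hypothetical helper, not from the paper).

    Assumptions: the vector is min-max normalised to [0, 1] first, and the
    default L1 sensitivity is the worst case for a vector in [0, 1]^d, i.e. d.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    rng = random.Random(seed)

    # Bound every coordinate to [0, 1] so the sensitivity is finite.
    lo, hi = min(embedding), max(embedding)
    span = (hi - lo) or 1.0  # avoid division by zero for constant vectors
    normalized = [(x - lo) / span for x in embedding]

    if sensitivity is None:
        sensitivity = len(normalized)

    # Laplace mechanism: noise scale = sensitivity / epsilon.
    scale = sensitivity / epsilon
    return [x + laplace_noise(scale, rng) for x in normalized]


if __name__ == "__main__":
    vec = [0.2, -1.3, 0.7, 2.4]
    print(privatize_embedding(vec, epsilon=10.0, seed=42))
```

A smaller `epsilon` gives a larger noise scale and therefore stronger obfuscation, at the cost of more distortion in the representation. This is the privacy/utility trade-off that the abstract refers to as preserving semantic links while obscuring sensitive information.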

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Adversarial Robustness in Machine Learning · Internet Traffic Analysis and Secure E-voting