A Neighbourhood-Aware Differential Privacy Mechanism for Static Word Embeddings

Danushka Bollegala; Shuichi Otake; Tomoya Machide; Ken-ichi Kawarabayashi

arXiv:2309.10551·cs.LG·September 20, 2023

TL;DR

This paper introduces a neighbourhood-aware differential privacy mechanism for static word embeddings. It sets the amount of noise added to each word according to that word's local neighbourhood in the embedding space, aiming for a better privacy-utility trade-off in downstream NLP tasks.

Contribution

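The paper presents the Neighbourhood-Aware Differential Privacy (NADP) mechanism, which uses each word's neighbourhood in the embedding space to determine the minimal amount of Gaussian noise needed to guarantee a specified privacy level. The authors report that it outperforms the Laplacian, Gaussian, and Mahalanobis DP mechanisms in multiple downstream tasks.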

Findings

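01. NADP outperforms the Laplacian, Gaussian, and Mahalanobis DP mechanisms in multiple downstream tasks.

02. NADP guarantees higher levels of privacy than these mechanisms while outperforming them on those tasks.

03. Factorising a nearest neighbour graph over the words into connected components (neighbourhoods) allows a different level of Gaussian noise to be applied to each neighbourhood.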

Abstract

We propose a Neighbourhood-Aware Differential Privacy (NADP) mechanism considering the neighbourhood of a word in a pretrained static word embedding space to determine the minimal amount of noise required to guarantee a specified privacy level. We first construct a nearest neighbour graph over the words using their embeddings, and factorise it into a set of connected components (i.e. neighbourhoods). We then separately apply different levels of Gaussian noise to the words in each neighbourhood, determined by the set of words in that neighbourhood. Experiments show that our proposed NADP mechanism consistently outperforms multiple previously proposed DP mechanisms such as Laplacian, Gaussian, and Mahalanobis in multiple downstream tasks, while guaranteeing higher levels of privacy.
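
The pipeline described in the abstract (nearest neighbour graph, connected components, per-neighbourhood Gaussian noise) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function names `knn_components` and `neighbourhood_gaussian_noise` are hypothetical, and the abstract does not say how the noise level is derived from a neighbourhood. Here it is assumed, purely for illustration, that the sensitivity of a neighbourhood is its largest pairwise Euclidean distance and that the noise scale follows the classical Gaussian mechanism calibration, sigma = sqrt(2 ln(1.25/delta)) * sensitivity / epsilon.

```python
import numpy as np


def knn_components(emb, k):
    """Label each row of `emb` with the connected component of its
    (undirected) k-nearest-neighbour graph. Hypothetical helper."""
    n = emb.shape[0]
    # Pairwise squared Euclidean distances; the diagonal is masked so
    # that a word is never selected as its own neighbour.
    sq = np.sum(emb ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * emb @ emb.T
    np.fill_diagonal(d2, np.inf)
    nbrs = np.argsort(d2, axis=1)[:, :k]

    # Union-find over the k-NN edges gives the connected components.
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for i in range(n):
        for j in nbrs[i]:
            ri, rj = find(i), find(int(j))
            if ri != rj:
                parent[ri] = rj

    roots = np.array([find(i) for i in range(n)])
    _, labels = np.unique(roots, return_inverse=True)
    return labels


def neighbourhood_gaussian_noise(emb, k=2, epsilon=1.0, delta=1e-5, seed=0):
    """Add Gaussian noise whose scale depends on each word's neighbourhood.

    ASSUMPTION (not from the paper): sensitivity = diameter of the
    neighbourhood, sigma = sqrt(2 ln(1.25/delta)) * sensitivity / epsilon.
    """
    rng = np.random.default_rng(seed)
    emb = np.asarray(emb, dtype=float)
    labels = knn_components(emb, k)
    n_comp = int(labels.max()) + 1
    scale = np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon

    noisy = emb.copy()
    sigmas = np.zeros(n_comp)
    for comp in range(n_comp):
        idx = np.where(labels == comp)[0]
        pts = emb[idx]
        # Largest pairwise distance inside this neighbourhood.
        diffs = pts[:, None, :] - pts[None, :, :]
        sensitivity = np.sqrt((diffs ** 2).sum(axis=-1)).max()
        sigmas[comp] = scale * sensitivity
        noisy[idx] += rng.normal(0.0, sigmas[comp], size=pts.shape)
    return noisy, labels, sigmas
```

Under these assumptions, a tightly packed neighbourhood receives less noise than a spread-out one, which is the intuition behind applying different noise levels per neighbourhood. The dense distance matrix keeps the sketch short but scales quadratically with vocabulary size, so a real vocabulary would need an approximate nearest neighbour index instead.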

Code & Models

Repositories

shuichiotake/nadp (official)

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Internet Traffic Analysis and Secure E-voting · Privacy, Security, and Data Protection