Inside the Black Box: Detecting Data Leakage in Pre-trained Language Encoders

Yuan Xin, Zheng Li, Ning Yu, Dingfan Chen, Mario Fritz, Michael Backes, and Yang Zhang

arXiv:2408.11046 · cs.CL · August 21, 2024

TL;DR

This paper systematically investigates privacy risks in pre-trained language encoders, specifically membership leakage of pre-training data, and finds that leakage occurs across various architectures and tasks even when only black-box access to the downstream model is available.

Contribution

It is the first comprehensive study to demonstrate membership leakage of pre-training data in pre-trained language encoders through the black-box outputs of downstream models, across multiple architectures and datasets.

Findings

01. Membership leakage exists even when only the black-box output of the downstream model is exposed.

02. Leakage is consistent across different encoder architectures and downstream tasks.

03. The results provide insights for improving privacy protections in NLP models.

Abstract

Despite being prevalent in the general field of Natural Language Processing (NLP), pre-trained language models inherently carry privacy and copyright concerns because they are trained on large-scale web-scraped data. In this paper, we pioneer a systematic exploration of such risks associated with pre-trained language encoders, focusing specifically on the membership leakage of pre-training data exposed through downstream models adapted from these encoders, an aspect largely overlooked in existing literature. Our study encompasses comprehensive experiments across four types of pre-trained encoder architectures, three representative downstream tasks, and five benchmark datasets. Intriguingly, our evaluations reveal, for the first time, the existence of membership leakage even when only the black-box output of the downstream model is exposed, highlighting a privacy…
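
To make the notion of membership leakage from black-box output concrete, here is a minimal sketch of a generic confidence-threshold membership inference test. It is not the attack used in the paper, which this page does not describe. The function names, the threshold value, and the synthetic probability vectors are all assumptions made for illustration; in practice the probability vectors would come from querying a downstream model.

```python
import random


def max_confidence(probs):
    """Attack signal: the highest class probability returned by the black-box model."""
    return max(probs)


def infer_membership(probs, threshold):
    """Guess 'member' when the model's top confidence reaches the threshold."""
    return max_confidence(probs) >= threshold


def attack_accuracy(member_outputs, nonmember_outputs, threshold):
    """Balanced accuracy of the threshold rule on labeled probe samples."""
    true_pos = sum(infer_membership(p, threshold) for p in member_outputs)
    true_neg = sum(not infer_membership(p, threshold) for p in nonmember_outputs)
    return 0.5 * (true_pos / len(member_outputs) + true_neg / len(nonmember_outputs))


def toy_outputs(n, low, high, rng):
    """Synthetic 2-class probability vectors with top confidence drawn from [low, high].

    Hypothetical stand-in for real model outputs; the ranges are illustrative only.
    """
    outputs = []
    for _ in range(n):
        c = rng.uniform(low, high)
        outputs.append([c, 1.0 - c])
    return outputs


# Assumption for the toy setup: the model tends to be more confident on
# samples it has seen (members) than on unseen samples (non-members).
rng = random.Random(0)
members = toy_outputs(200, 0.80, 1.00, rng)
nonmembers = toy_outputs(200, 0.55, 0.90, rng)

acc = attack_accuracy(members, nonmembers, threshold=0.85)
print(f"balanced attack accuracy on synthetic data: {acc:.2f}")
```

A balanced accuracy clearly above 0.5 on such probe data would indicate that the output alone separates members from non-members. The number printed here reflects only the synthetic setup and says nothing about the paper's results.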

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Context-Aware Activity Recognition Systems · Anomaly Detection Techniques and Applications