Privacy Evaluation Benchmarks for NLP Models
Wei Huang, Yinggui Wang, Cen Chen

TL;DR
This paper introduces a comprehensive benchmark for evaluating privacy attacks and defenses on NLP models, including various models, datasets, and protocols, to systematically understand privacy risks.
Contribution
It presents a holistic benchmark framework for privacy assessment in NLP, introduces an improved attack method using Knowledge Distillation, and proposes a chained attack framework for higher-level objectives.
Findings
A new benchmark supports diverse models and datasets for privacy evaluation.
An improved attack method leveraging Knowledge Distillation enhances attack effectiveness.
A chained attack framework enables combining multiple attacks for stronger privacy breaches.
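The findings name Knowledge Distillation but this summary does not describe how the attack uses it. As a rough illustration of the generic technique only (not the paper's attack), the sketch below computes the standard temperature-scaled distillation loss between a teacher's and a student's logits. Reading the teacher as the target model and the student as an attacker-side surrogate is an assumption made here; the function names and the default temperature are hypothetical.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over the last axis."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Mean KL(teacher || student) on softened distributions, scaled by T^2.

    Generic distillation objective; the mapping of teacher/student to
    target/surrogate models is an assumption, not taken from the paper.
    """
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    eps = 1e-12  # avoids log(0)
    kl = np.sum(
        p_teacher * (np.log(p_teacher + eps) - np.log(p_student + eps)),
        axis=-1,
    )
    return float(kl.mean() * temperature ** 2)


if __name__ == "__main__":
    teacher = np.array([[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]])
    student = np.array([[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
    print(distillation_loss(student, teacher))
```

The loss is zero when the student reproduces the teacher's logits exactly and grows as the two softened distributions diverge, which is what lets a student be trained to mimic a model it can only query.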
Abstract
By mounting privacy attacks on NLP models, attackers can obtain sensitive information such as training data and model parameters. Although researchers have studied several kinds of attacks on NLP models in depth, these analyses are not systematic, and a comprehensive understanding of the impact of the attacks is lacking. For example, we must consider which attacks apply to which scenarios, what common factors affect the performance of different attacks, how different attacks relate to one another, and how various datasets and models influence the effectiveness of the attacks. Therefore, we need a benchmark to holistically assess the privacy risks faced by NLP models. In this paper, we present a privacy attack and defense evaluation benchmark in the field of NLP, which includes the conventional/small models and large language models…
Taxonomy
Topics: Ethics and Social Impacts of AI
Methods: Knowledge Distillation
