Estimating Privacy Leakage of Augmented Contextual Knowledge in Language Models
James Flemings, Bo Jiang, Wanrong Zhang, Zafar Takhirov, Murali Annavaram

TL;DR
This paper introduces a new metric called context influence, based on differential privacy, to accurately estimate privacy leakage in language models caused by augmented contextual knowledge, addressing overestimation issues of previous methods.
Contribution
It proposes the context influence metric to measure privacy leakage from contextual knowledge in language models, effectively isolating the impact of augmented contexts.
Findings
Privacy leakage from the context occurs when the context is out of distribution relative to the model's parametric knowledge.
Context influence accurately attributes privacy leakage to augmented contexts.
Factors like model size and context size affect privacy leakage.
Abstract
Language models (LMs) rely on their parametric knowledge augmented with relevant contextual knowledge for certain tasks, such as question answering. However, the contextual knowledge can contain private information that may be leaked when answering queries, and how to estimate this privacy leakage is not well understood. A straightforward approach of directly comparing an LM's output to the contexts can overestimate the privacy risk, since the LM's parametric knowledge might already contain the augmented contextual knowledge. To this end, we introduce *context influence*, a metric that builds on differential privacy, a widely adopted privacy notion, to estimate the privacy leakage of contextual knowledge during decoding. Our approach effectively measures how each subset of the context influences an LM's response while separating the specific parametric knowledge of the LM. Using our context…
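The idea of measuring how a context subset changes the response, while discounting what the model already knows, can be illustrated with a toy calculation. The sketch below is one possible differential-privacy-style reading, not the paper's definition: it assumes influence is the absolute log-ratio between the probability the model assigns to each generated token with the full context and with a context subset removed. The function names `token_influence` and `response_influence` and the hand-written distributions are hypothetical; no real LM is queried.

```python
import math

def token_influence(p_full, p_reduced, token):
    """Absolute log-ratio of the probability of `token` with the full
    context versus with a context subset removed (assumed formulation)."""
    return abs(math.log(p_full[token]) - math.log(p_reduced[token]))

def response_influence(steps):
    """Sum the per-token influence over a generated response.

    `steps` is a list of (p_full, p_reduced, token) tuples, one per
    decoding step, where each p_* maps candidate tokens to probabilities.
    """
    return sum(token_influence(pf, pr, tok) for pf, pr, tok in steps)

# Toy next-token distributions (made up for illustration only).
steps = [
    # The model already knows this answer: removing the context changes nothing.
    ({"Paris": 0.9, "Rome": 0.1}, {"Paris": 0.9, "Rome": 0.1}, "Paris"),
    # The answer comes from the context: removing it shifts the distribution.
    ({"1984": 0.8, "1990": 0.2}, {"1984": 0.1, "1990": 0.9}, "1984"),
]

for p_full, p_reduced, token in steps:
    print(token, round(token_influence(p_full, p_reduced, token), 3))
print("total", round(response_influence(steps), 3))
```

Under this assumed formulation, a token that appears verbatim in the context but is equally likely without it contributes zero, which is the overestimation that a direct output-to-context comparison would not avoid.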
Taxonomy
Topics: Advanced Text Analysis Techniques · Semantic Web and Ontologies · Natural Language Processing Techniques
Methods: LLaMA
