Privacy Bias in Language Models: A Contextual Integrity-based Auditing Metric

Yan Shvartzshnaider; Vasisht Duddu

arXiv:2409.03735·cs.LG·December 22, 2025

Open Access

TL;DR

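This paper introduces privacy bias, an auditing metric grounded in contextual integrity, for evaluating the appropriateness of information flows in LLM responses. The metric is intended to help model trainers, service providers, and policymakers assess the ethical and societal impact of LLMs.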

Contribution

The paper presents a contextual integrity-based methodology for reliably measuring privacy biases in LLM responses, accounting for prompt sensitivity and for model factors such as capacity and optimization.

Findings

01 Privacy bias varies with model capacity and optimization.

02 The privacy bias delta, the deviation between a model's privacy bias and the expected value, can flag potential privacy violations.

03 Sensitivity to prompt variations influences privacy bias assessments.

Abstract

As large language models (LLMs) are integrated into sociotechnical systems, it is crucial to examine the privacy biases they exhibit. We define privacy bias as the appropriateness value of information flows in responses from LLMs. A deviation between privacy biases and expected values, referred to as privacy bias delta, may indicate privacy violations. As an auditing metric, privacy bias can help (a) model trainers evaluate the ethical and societal impact of LLMs, (b) service providers select context-appropriate LLMs, and (c) policymakers assess the appropriateness of privacy biases in deployed LLMs. We formulate and answer a novel research question: how can we reliably examine privacy biases in LLMs and the factors that influence them? We present a novel approach for assessing privacy biases using a contextual integrity-based methodology to evaluate the responses from various LLMs. Our…
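The two definitions in the abstract (privacy bias as the appropriateness value of an information flow, and privacy bias delta as its deviation from an expected value) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The five flow parameters follow the standard contextual integrity framing; the -2 to 2 rating scale, the averaging over prompt variants, the signed difference, and all names and numbers below are assumptions made for illustration only.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class InformationFlow:
    """An information flow described by the five contextual integrity parameters."""
    sender: str
    recipient: str
    subject: str
    information_type: str
    transmission_principle: str


def privacy_bias(ratings: list[int]) -> float:
    """Assumed aggregation: mean appropriateness rating over prompt variants.

    Ratings use a hypothetical scale from -2 (strongly inappropriate)
    to 2 (strongly appropriate).
    """
    return mean(ratings)


def privacy_bias_delta(observed: float, expected: float) -> float:
    """Assumed form of the delta: signed difference from the expected value."""
    return observed - expected


# Hypothetical flow and hypothetical model ratings for four prompt variants.
flow = InformationFlow(
    sender="fitness tracker",
    recipient="insurance company",
    subject="owner",
    information_type="heart rate",
    transmission_principle="if the owner is not notified",
)
ratings = [-1, -1, 0, -1]

bias = privacy_bias(ratings)
delta = privacy_bias_delta(bias, expected=-2.0)  # expected value is also hypothetical
print(f"bias={bias}, delta={delta}")
```

In this toy setup a positive delta means the model rates the flow as more appropriate than the expected value does, which is the kind of deviation the abstract says may indicate a privacy violation.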

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling

Methods: ALIGN