Privacy Policy Enforcement Guardrails for Data-Sensitive Retrieval-Augmented Generation
Osama Zafar, Alexander Nemecek, Yiqian Zhang, Wenbiao Li, Debargha Ganguly, Vikash Singh, Vipin Chaudhary, Erman Ayday

TL;DR
This paper presents a Privacy Policy Enforcement (PPE) framework for data-sensitive retrieval-augmented generation (RAG) systems. It detects contextual data leakage using dual one-class density estimators trained on synthetic data, reaching a borderline AUROC of 0.93+ at millisecond latency.
Contribution
It introduces a Privacy Policy Enforcement (PPE) framework built on dual one-class density estimators and a calibrated abstain region for out-of-distribution inputs. On borderline-safe stress tests, it outperforms traditional Gaussian Mixture baselines.
Findings
Achieves a borderline AUROC of 0.93+ on stress tests.
Reduces false positives by 44-55 percentage points.
Maintains millisecond latency for real-time detection.
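To make the two headline metrics concrete, here is a minimal sketch of how an AUROC and a false-positive reduction in percentage points can be computed. All scores below are hypothetical illustration values, not data from the paper, and the function names are assumptions made for this example.

```python
def auroc(scores_pos, scores_neg):
    """Probability that a random positive outscores a random negative (ties count half)."""
    wins = 0.0
    for p in scores_pos:
        for n in scores_neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


def false_positive_rate(scores_on_safe, threshold):
    """Fraction of safe items that a detector flags (score at or above the threshold)."""
    flagged = sum(1 for s in scores_on_safe if s >= threshold)
    return flagged / len(scores_on_safe)


# Hypothetical leak scores (higher = more likely a leak).
leak_scores = [0.9, 0.8, 0.7, 0.4]
borderline_safe_scores = [0.6, 0.3, 0.2, 0.1]
borderline_auroc = auroc(leak_scores, borderline_safe_scores)

# Hypothetical scores two detectors assign to the same four safe items.
baseline_on_safe = [0.9, 0.7, 0.6, 0.2]
improved_on_safe = [0.6, 0.3, 0.2, 0.1]
baseline_fpr = false_positive_rate(baseline_on_safe, threshold=0.5)
improved_fpr = false_positive_rate(improved_on_safe, threshold=0.5)

# A percentage-point reduction is the plain difference of two rates, times 100.
reduction_pp = (baseline_fpr - improved_fpr) * 100
```

A percentage-point reduction is an absolute difference between two rates, so it should not be read as a relative (percent) change.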
Abstract
Standard PII filters often miss contextual data leakage in RAG systems, such as non-regulated attribute clusters that collectively identify individuals. We introduce a Privacy Policy Enforcement (PPE) framework using dual one-class density estimators with fused text embeddings and a calibrated abstain region for out-of-distribution inputs. Using an axis-stratified, multi-LLM synthetic data pipeline across medicine, finance, and law, we found that traditional Gaussian Mixture baselines fail on borderline-safe stress tests by focusing on linguistic register rather than content. Our proposed T3+OCSVM detector, trained on safe and borderline-safe data, achieves a borderline AUROC of 0.93+ while reducing false positives by 44-55 percentage points and maintaining millisecond latency. Compared to supervised MLP classifiers or 14B-parameter LLM judges, our framework offers superior…
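The idea of two one-class scorers with an abstain region can be sketched as follows. This is an illustration under stated assumptions, not the paper's T3+OCSVM detector: it swaps in a plain Gaussian kernel density score over toy 2-D vectors in place of an OCSVM over fused text embeddings, it assumes one estimator is fit on allowed examples and the other on disallowed examples (the summary does not say how data is assigned to the two estimators), and the class name, bandwidth, and quantile are hypothetical.

```python
import math


def kde_score(x, train, bandwidth=0.5):
    """Mean Gaussian-kernel similarity between x and the training points."""
    total = 0.0
    for t in train:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, t))
        total += math.exp(-sq_dist / (2 * bandwidth ** 2))
    return total / len(train)


def calibrate_floor(train, quantile=0.1, bandwidth=0.5):
    """Low quantile of leave-one-out scores; anything below it looks out of distribution."""
    loo = []
    for i, x in enumerate(train):
        rest = train[:i] + train[i + 1:]
        loo.append(kde_score(x, rest, bandwidth))
    loo.sort()
    return loo[int(quantile * len(loo))]


class DualDensityGuardrail:
    """Hypothetical two-estimator guardrail with an abstain region."""

    def __init__(self, allowed, disallowed, bandwidth=0.5):
        self.allowed = allowed
        self.disallowed = disallowed
        self.bandwidth = bandwidth
        self.allowed_floor = calibrate_floor(allowed, bandwidth=bandwidth)
        self.disallowed_floor = calibrate_floor(disallowed, bandwidth=bandwidth)

    def decide(self, x):
        s_allow = kde_score(x, self.allowed, self.bandwidth)
        s_block = kde_score(x, self.disallowed, self.bandwidth)
        # Abstain when the input looks unfamiliar to both estimators.
        if s_allow < self.allowed_floor and s_block < self.disallowed_floor:
            return "abstain"
        return "allow" if s_allow >= s_block else "block"


# Toy stand-ins for embeddings: two well-separated clusters.
allowed = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (-0.1, 0.1)]
disallowed = [(3.0, 3.0), (3.2, 2.9), (2.9, 3.1), (3.1, 3.3)]
guard = DualDensityGuardrail(allowed, disallowed)

near_allowed = guard.decide((0.1, 0.1))
near_disallowed = guard.decide((3.0, 3.1))
far_from_both = guard.decide((10.0, -10.0))
```

The abstain branch is what separates this design from a plain two-class comparison: an input that neither estimator recognizes is routed to a fallback instead of being forced into allow or block.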
