Adaptive Backtracking for Privacy Protection in Large Language Models
Zhihao Yao, Yuxuan Gu, Xiachong Feng, Weitao Ma, Bo Li, Xiaocheng Feng

TL;DR
This paper introduces ABack, a training-free method for protecting enterprise privacy in large language models, along with PriGenQA, a new benchmark for evaluating it. ABack improves privacy utility scores by up to 15%.
Contribution
The paper presents ABack, a novel, training-free approach that leverages a Hidden State Model to protect enterprise privacy, and introduces PriGenQA, a benchmark dataset for evaluating privacy in healthcare and finance.
Findings
ABack improves privacy utility scores by up to 15%.
It avoids the performance degradation common in prior methods such as data sanitization.
The PriGenQA benchmark enables rigorous evaluation against adaptive attacks.
Abstract
The preservation of privacy has emerged as a critical topic in the era of artificial intelligence. However, current work focuses on user-oriented privacy, overlooking severe enterprise data leakage risks exacerbated by the Retrieval-Augmented Generation paradigm. To address this gap, our paper introduces a novel objective: enterprise-oriented privacy concerns. Achieving this objective requires overcoming two fundamental challenges: existing methods such as data sanitization severely degrade model performance, and the field lacks public datasets for evaluation. We address these challenges with several solutions. (1) To prevent performance degradation, we propose ABack, a training-free mechanism that leverages a Hidden State Model to pinpoint the origin of a leakage intention and rewrite the output safely. (2) To solve the lack of datasets, we construct PriGenQA, a new benchmark for…
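The backtrack-and-rewrite idea described in (1) can be illustrated with a toy sketch. It is not the authors' implementation: the abstract does not specify how the Hidden State Model scores tokens, so the sketch assumes per-token leakage scores are already available from some detector. The names `find_leak_origin` and `backtrack_and_rewrite`, the `threshold` and `onset` parameters, and the `rewrite` callback are all hypothetical.

```python
from typing import Callable, List, Sequence


def find_leak_origin(scores: Sequence[float], threshold: float, onset: float) -> int:
    """Return the index where a leakage run-up begins, or -1 if no leak is flagged.

    Assumption: scores[i] is a leakage score in [0, 1] for token i, produced by
    some external detector (hypothetical; not specified in the abstract).
    """
    # First position whose score reaches the alarm threshold.
    hit = next((i for i, s in enumerate(scores) if s >= threshold), -1)
    if hit < 0:
        return -1
    # Walk backwards while the preceding tokens are still above the lower
    # onset level, so the cut lands where the leakage intention started.
    origin = hit
    while origin > 0 and scores[origin - 1] >= onset:
        origin -= 1
    return origin


def backtrack_and_rewrite(
    tokens: List[str],
    scores: Sequence[float],
    rewrite: Callable[[List[str]], List[str]],
    threshold: float = 0.8,
    onset: float = 0.25,
) -> List[str]:
    """Keep the safe prefix and let `rewrite` produce a safe continuation."""
    origin = find_leak_origin(scores, threshold, onset)
    if origin < 0:
        return list(tokens)  # nothing flagged, output unchanged
    prefix = list(tokens[:origin])
    return prefix + rewrite(prefix)


# Toy usage: the score rises from token 2 and crosses the threshold at token 4.
tokens = ["The", "client", "account", "number", "is"]
scores = [0.0, 0.05, 0.3, 0.6, 0.9]
safe = backtrack_and_rewrite(tokens, scores, rewrite=lambda prefix: ["[withheld]"])
print(safe)
```

In this sketch the output is cut at the start of the run-up rather than at the token that triggers the alarm, which is one plausible reading of "pinpoint the origin of a leakage intention"; a real system would regenerate the continuation with the model instead of inserting a fixed placeholder.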
Taxonomy
Topics: Privacy-Preserving Technologies in Data
