DP-TabICL: In-Context Learning with Differentially Private Tabular Data
Alycia N. Carey, Karuna Bhaila, Kennedy Edemacu, Xintao Wu

TL;DR
This paper introduces methods to incorporate differential privacy into in-context learning with large language models for tabular data, ensuring data privacy without significantly sacrificing performance.
Contribution
It proposes two novel differential privacy frameworks for tabular ICL, providing formal privacy guarantees and demonstrating effectiveness on real-world datasets.
Findings
DP-TabICL protects sensitive tabular data during ICL
Performance remains comparable to non-private methods under strong privacy guarantees
Frameworks are validated on eight real-world datasets
Abstract
In-context learning (ICL) enables large language models (LLMs) to adapt to new tasks by conditioning on demonstrations of question-answer pairs, and it has been shown to perform comparably to costly model retraining and fine-tuning. Recently, ICL has been extended to allow tabular data to be used as demonstration examples by serializing individual records into natural language formats. However, it has been shown that LLMs can leak information contained in prompts, and since tabular data often contain sensitive information, understanding how to protect the underlying tabular data used in ICL is a critical area of research. This work serves as an initial investigation into how to use differential privacy (DP) -- the long-established gold standard for data privacy and anonymization -- to protect tabular data used in ICL. Specifically, we investigate the application of DP mechanisms…
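The abstract describes serializing tabular records into natural language for use as ICL demonstrations, and applying DP mechanisms to protect them. The abstract is truncated here, so the specific mechanisms the authors use are not shown. The following is only a minimal sketch under stated assumptions: the function names (`serialize_record`, `randomized_response`, `build_prompt`), the sentence template, and the example column names and values are all hypothetical, and k-ary randomized response is a standard local DP mechanism chosen for illustration, not necessarily the paper's method.

```python
import math
import random


def serialize_record(record, label_name, label=None):
    """Turn one tabular row into a natural-language line (hypothetical template)."""
    features = ", ".join(f"{name} is {value}" for name, value in record.items())
    if label is None:
        # Query row: leave the label blank for the LLM to complete.
        return f"{features}. {label_name}:"
    return f"{features}. {label_name}: {label}"


def randomized_response(value, domain, epsilon, rng):
    """k-ary randomized response on one categorical value.

    Keeps the true value with probability e^eps / (e^eps + k - 1); otherwise
    reports one of the other k - 1 values uniformly at random. This satisfies
    epsilon-local DP for that single value.
    """
    k = len(domain)
    p_keep = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p_keep:
        return value
    others = [v for v in domain if v != value]
    return rng.choice(others)


def build_prompt(demonstrations, query, label_name):
    """Join serialized demonstrations and the unlabeled query into one prompt."""
    lines = [serialize_record(rec, label_name, lab) for rec, lab in demonstrations]
    lines.append(serialize_record(query, label_name))
    return "\n".join(lines)


if __name__ == "__main__":
    rng = random.Random(0)
    occupations = ["nurse", "clerk", "teacher"]  # made-up categorical domain

    # Made-up record; only the categorical feature is perturbed in this sketch.
    raw = {"age": 39, "occupation": "nurse"}
    private = dict(raw)
    private["occupation"] = randomized_response(
        raw["occupation"], occupations, epsilon=1.0, rng=rng
    )

    query = {"age": 52, "occupation": "clerk"}
    print(build_prompt([(private, ">50K")], query, "income"))
```

A smaller `epsilon` makes the perturbed value more likely to differ from the true one, which is the privacy/utility trade-off the Findings refer to. Numeric columns and the labels themselves would need their own treatment, which this sketch leaves out.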
Taxonomy
Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Data Quality and Management
