PPMI: Privacy-Preserving LLM Interaction with Socratic Chain-of-Thought Reasoning and Homomorphically Encrypted Vector Databases

Yubeen Bae; Minchan Kim; Jaejin Lee; Sangbum Kim; Jaehyung Kim; Yejin Choi; Niloofar Mireshghallah

arXiv:2506.17336 · cs.CR · November 4, 2025

TL;DR

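This paper introduces a hybrid system for privacy-preserving interaction with large language models: a powerful but untrusted LLM handles the generic, non-private part of a query, while a local model and an encrypted vector database handle the user's private data, with the aim of improving both privacy and task performance.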

Contribution

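It presents a framework that combines Socratic Chain-of-Thought Reasoning with a Homomorphically Encrypted Vector Database to access private data securely during LLM interactions, bridging the gap between sending private records to strong but untrusted remote models and running weaker models locally.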

Findings

01. Outperforms GPT-4o alone by up to 7.1 percentage points on the LoCoMo benchmark.

02. Demonstrates sub-second encrypted semantic search over one million entries of a single user's private data.

03. Shows that tasks can feasibly be decomposed between untrusted LLMs and local models to preserve privacy.

Abstract

Large language models (LLMs) are increasingly used as personal agents, accessing sensitive user data such as calendars, emails, and medical records. Users currently face a trade-off: They can send private records, many of which are stored in remote databases, to powerful but untrusted LLM providers, increasing their exposure risk. Alternatively, they can run less powerful models locally on trusted devices. We bridge this gap. Our Socratic Chain-of-Thought Reasoning first sends a generic, non-private user query to a powerful, untrusted LLM, which generates a Chain-of-Thought (CoT) prompt and detailed sub-queries without accessing user data. Next, we embed these sub-queries and perform encrypted sub-second semantic search using our Homomorphically Encrypted Vector Database across one million entries of a single user's private data. This represents a realistic scale of personal documents,…
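
The data flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the search here is plaintext top-k cosine-similarity retrieval standing in for the encrypted search (no homomorphic encryption is performed), `embed` is a hypothetical toy bag-of-words embedding, and the CoT prompt and sub-queries are hard-coded placeholders for what the untrusted LLM would return. The names `top_k` and `build_local_prompt` are likewise assumptions made for this sketch, as is the final step of handing the assembled prompt to a local model.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words embedding (hypothetical stand-in for a real embedding model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(sub_query, records, k=1):
    """Plaintext nearest-neighbour search; the paper performs this step under encryption."""
    q = embed(sub_query)
    ranked = sorted(records, key=lambda r: cosine(q, embed(r)), reverse=True)
    return ranked[:k]


def build_local_prompt(cot_prompt, sub_queries, records, k=1):
    """Combine the remote model's CoT prompt with privately retrieved records.

    The result is meant for a trusted local model, so private records
    never reach the untrusted provider.
    """
    retrieved = []
    for sq in sub_queries:
        for rec in top_k(sq, records, k):
            if rec not in retrieved:
                retrieved.append(rec)
    context = "\n".join(retrieved)
    return f"{cot_prompt}\n\nPrivate context:\n{context}"


# Private records stay on the user's side (in the paper, in an encrypted database).
records = [
    "dentist appointment on friday at 3pm",
    "flight to seoul departs monday morning",
    "prescription refill due next week",
]

# Placeholders for what the untrusted LLM would return from a generic query.
cot_prompt = "Reason step by step about the user's schedule."
sub_queries = ["when is the dentist appointment", "when does the flight depart"]

prompt = build_local_prompt(cot_prompt, sub_queries, records)
print(prompt)
```

The point of the split is that only the generic query crosses the trust boundary; retrieval and the final answer both operate on data the remote provider never sees.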

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Cryptography and Data Security · Adversarial Robustness in Machine Learning