FRAG: Toward Federated Vector Database Management for Collaborative and Secure Retrieval-Augmented Generation
Dongfang Zhao

TL;DR
FRAG is a new federated database system enabling secure, privacy-preserving vector searches for large language model applications, balancing strong security with practical performance.
Contribution
FRAG introduces a novel federated vector database management paradigm with secure encrypted searches and a multiplicative caching technique for efficiency.
Findings
Achieves IND-CPA security under practical assumptions.
Demonstrates scalable performance on large datasets.
Provides rigorous security proofs and extensive experimental validation.
Abstract
This paper introduces Federated Retrieval-Augmented Generation (FRAG), a novel database management paradigm tailored to the growing needs of retrieval-augmented generation (RAG) systems, which are increasingly powered by large language models (LLMs). FRAG enables mutually distrusted parties to collaboratively perform Approximate k-Nearest Neighbor (ANN) searches on encrypted query vectors and encrypted data stored in distributed vector databases, while ensuring that no party learns anything about the queries or data of the others. Achieving this paradigm presents two key challenges: (i) ensuring strong security guarantees, such as Indistinguishability under Chosen-Plaintext Attack (IND-CPA), under practical assumptions (e.g., we avoid overly optimistic assumptions like non-collusion among parties); and (ii) maintaining performance overheads comparable to traditional,…
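To make the federated search setting concrete, the following is a minimal plaintext sketch of a top-k query answered across several parties' vector stores. It is not FRAG's protocol: it uses no encryption, performs exact rather than approximate search, and every name in it (`local_top_k`, `federated_top_k`, `party_a`, `party_b`) is hypothetical. It only illustrates the merge step that a federated system has to compute, which FRAG aims to perform without revealing queries or data.

```python
import heapq
import math


def local_top_k(query, vectors, k):
    """Return one party's k nearest vectors as (distance, id) pairs.

    Assumption: plaintext Euclidean distance; FRAG operates on encrypted data.
    """
    scored = ((math.dist(query, vec), vec_id) for vec_id, vec in vectors.items())
    return heapq.nsmallest(k, scored)


def federated_top_k(query, parties, k):
    """Merge every party's local top-k into a global top-k.

    Each party contributes at most k candidates, so the global top-k
    is guaranteed to be among the merged candidates.
    """
    candidates = []
    for party_name, vectors in parties.items():
        for dist, vec_id in local_top_k(query, vectors, k):
            candidates.append((dist, party_name, vec_id))
    return heapq.nsmallest(k, candidates)


# Hypothetical toy data: two parties, two 2-D vectors each.
parties = {
    "party_a": {"a1": (0.0, 0.0), "a2": (5.0, 5.0)},
    "party_b": {"b1": (1.0, 0.0), "b2": (0.0, 3.0)},
}
result = federated_top_k((0.0, 0.0), parties, k=2)
for dist, party_name, vec_id in result:
    print(f"{party_name}/{vec_id}: distance {dist:.2f}")
```

In this plaintext version the coordinator sees the query and every candidate distance, which is exactly the leakage that an encrypted federated design is meant to prevent.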
Taxonomy
Topics: Distributed systems and fault tolerance · Advanced Database Systems and Queries · Access Control and Trust
