ClusterRAG: Cluster-Based Collaborative Filtering for Personalized Retrieval-Augmented Generation
Gibson Nkhata, Uttamasha Anjally Oyshi, Quan Mai, Susan Gauch

TL;DR
ClusterRAG introduces a cluster-based collaborative filtering approach to personalized document retrieval in RAG systems. It aims to reduce retrieval cost and improve relevance by drawing on the profiles of similar users within the same cluster.
Contribution
It proposes a novel clustering-based method that combines user profiles and collaborative signals to enhance retrieval accuracy in RAG models.
Findings
ClusterRAG outperforms existing methods on the LaMP benchmark.
It effectively integrates with various dense retrievers and rankers.
The approach remains effective with both fine-tuned and zero-shot language models.
Abstract
Personalized Retrieval-Augmented Generation (RAG) relies on accurately selecting user-relevant documents. In practice, existing RAG approaches often suffer from high retrieval costs and overlook the fact that collaborative signals from similar users can enhance personalized generation for the current user. We propose ClusterRAG, a cluster-based collaborative filtering framework for personalized Retrieval-Augmented Generation. ClusterRAG represents users through their profile documents, organizes users into semantically coherent clusters using density-based clustering, and performs retrieval at both the cluster and document levels via cluster-level similarity and fine-grained ranking. Extensive experiments on the LaMP benchmark demonstrate that jointly leveraging the target user's profile and profiles from top similar users consistently yields the best performance across diverse tasks. Further analysis…
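The two-level retrieval described in the abstract can be illustrated with a small sketch. This is not the authors' implementation, and several choices are assumptions made only for illustration: each user is represented by the mean of their profile-document vectors, DBSCAN stands in for the unspecified density-based clustering, cosine similarity is used at both levels, and the names `retrieve`, `eps`, `min_samples`, `top_users`, and `k` are hypothetical. The toy 2-D vectors replace real embeddings from a dense retriever.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def mean_vector(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def dbscan(points, eps, min_samples):
    """Plain DBSCAN over cosine distance; returns one label per point (-1 = noise)."""
    labels = [None] * len(points)

    def neighbors(i):
        return [j for j in range(len(points))
                if 1.0 - cosine(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins the cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_samples:
                queue.extend(j_nbrs)  # core point: keep growing the cluster
    return labels

def retrieve(query_vec, target_user, profiles, eps=0.1, min_samples=2,
             top_users=1, k=3):
    """profiles maps user -> list of (doc_id, vector). Returns top-k (user, doc_id)."""
    users = list(profiles)
    # Assumption: a user vector is the mean of that user's profile-document vectors.
    user_vecs = [mean_vector([vec for _, vec in profiles[u]]) for u in users]
    labels = dbscan(user_vecs, eps, min_samples)
    t = users.index(target_user)
    # Cluster level: peers are other users in the target's cluster (none if noise).
    peers = [i for i in range(len(users))
             if i != t and labels[t] != -1 and labels[i] == labels[t]]
    peers.sort(key=lambda i: cosine(user_vecs[t], user_vecs[i]), reverse=True)
    selected = [t] + peers[:top_users]
    # Document level: rank the pooled documents against the query.
    pool = [(users[i], doc_id, vec)
            for i in selected for doc_id, vec in profiles[users[i]]]
    pool.sort(key=lambda d: cosine(query_vec, d[2]), reverse=True)
    return [(u, doc_id) for u, doc_id, _ in pool[:k]]

# Toy data: alice and bob have similar profiles, carol does not.
profiles = {
    "alice": [("a1", [1.0, 0.1]), ("a2", [0.9, 0.2])],
    "bob":   [("b1", [1.0, 0.0]), ("b2", [0.8, 0.3])],
    "carol": [("c1", [0.0, 1.0]), ("c2", [0.1, 0.9])],
}
result = retrieve([1.0, 0.0], "alice", profiles)
print(result)
```

In this toy run, the target user's pool includes documents from the one clustered peer, so a peer document can outrank the target's own documents. A user who falls outside every cluster (labelled as noise) retrieves only from their own profile.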
