Principled and Scalable Diversity-Aware Retrieval via Cardinality-Constrained Binary Quadratic Programming
Qiheng Lu, Nicholas D. Sidiropoulos

TL;DR
This paper introduces a scalable, theoretically grounded method for diversity-aware retrieval in RAG systems. It casts passage selection as a cardinality-constrained binary quadratic program (CCBQP) and solves a continuous relaxation with a Frank--Wolfe algorithm that has convergence guarantees.
Contribution
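The paper formulates diversity-aware retrieval as a cardinality-constrained binary quadratic program (CCBQP), derives a tight non-convex continuous relaxation, and solves it with a Frank--Wolfe algorithm that has convergence guarantees, improving both scalability and retrieval quality.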
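For orientation, a generic CCBQP for selecting k of n candidate passages might be written as below. This is only an illustrative sketch: the symbols r (relevance scores), S (pairwise similarity matrix), \lambda (trade-off parameter), and k (number of passages to return) are assumptions, and the paper's exact objective and relaxation are not given in this summary.

```latex
% Illustrative form only; r, S, \lambda, k are assumed notation.
\begin{aligned}
\max_{x \in \{0,1\}^n} \quad & (1-\lambda)\, r^\top x \;-\; \lambda\, x^\top S x \\
\text{s.t.} \quad & \mathbf{1}^\top x = k
\end{aligned}
% A continuous relaxation replaces x \in \{0,1\}^n with x \in [0,1]^n
% while keeping the cardinality constraint \mathbf{1}^\top x = k.
```

Under this reading, \lambda = 0 reduces to plain top-k retrieval by relevance, and larger \lambda penalizes selecting passages that are similar to each other.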
Findings
The method consistently dominates baselines on the relevance-diversity Pareto frontier.
It achieves a significant speedup over existing approaches.
It is backed by a landscape analysis and convergence guarantees.
Abstract
Diversity-aware retrieval is essential for Retrieval-Augmented Generation (RAG), yet existing methods lack theoretical guarantees and face scalability issues as the number of retrieved passages increases. We propose a principled formulation of diversity retrieval as a cardinality-constrained binary quadratic programming (CCBQP) problem, which explicitly balances relevance and semantic diversity through an interpretable trade-off parameter. Inspired by recent advances in combinatorial optimization, we develop a tight non-convex continuous relaxation and a Frank--Wolfe-based algorithm with landscape analysis and convergence guarantees. Extensive experiments demonstrate that our method consistently dominates baselines on the relevance-diversity Pareto frontier, while achieving a significant speedup.
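To make the idea concrete, here is a minimal Python sketch of a Frank--Wolfe iteration on a relaxed relevance-versus-diversity objective. It is not the authors' algorithm. The objective form, the function name `frank_wolfe_topk`, the parameters `lam` and `iters`, the 2/(t+2) step size, and the top-k rounding are all assumptions chosen for illustration.

```python
import numpy as np

def frank_wolfe_topk(relevance, similarity, k, lam=0.5, iters=200):
    """Hypothetical sketch: maximize
        f(x) = (1 - lam) * r^T x - lam * x^T S x
    over the relaxed set {x in [0, 1]^n, sum(x) = k}, then round to k items.
    The objective and step rule are assumptions, not the paper's method."""
    n = relevance.shape[0]
    x = np.full(n, k / n)  # feasible starting point: uniform mass summing to k
    for t in range(iters):
        # Gradient of the assumed objective (valid for non-symmetric S too).
        grad = (1 - lam) * relevance - lam * (similarity + similarity.T) @ x
        # Linear maximization oracle: over this feasible set, the maximizer of
        # grad^T s is the indicator of the k largest gradient entries.
        s = np.zeros(n)
        s[np.argpartition(-grad, k - 1)[:k]] = 1.0
        # Standard diminishing Frank-Wolfe step size.
        step = 2.0 / (t + 2.0)
        x = x + step * (s - x)
    # Simple rounding: keep the k coordinates with the most mass.
    selected = np.sort(np.argpartition(-x, k - 1)[:k])
    return selected, x

# Toy data (made up): items 0 and 1 are near-duplicates, item 2 is distinct.
r = np.array([1.0, 0.95, 0.6])
S = np.array([[0.0, 0.99, 0.0],
              [0.99, 0.0, 0.0],
              [0.0, 0.0, 0.0]])

top_relevance, _ = frank_wolfe_topk(r, S, k=2, lam=0.0)
diverse, _ = frank_wolfe_topk(r, S, k=2, lam=0.5)
print(top_relevance, diverse)
```

On this toy input, `lam=0.0` selects the two most relevant items (0 and 1), while `lam=0.5` replaces the near-duplicate item 1 with item 2. Each iteration costs one matrix-vector product plus a top-k selection, which is one reason projection-free methods of this kind scale well as the candidate pool grows.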
