ReinPool: Reinforcement Learning Pooling Multi-Vector Embeddings for Retrieval System
Sungguk Cha, DongWook Kim, Mintae Kim, Youngsub Han, Byoung-Ki Jeon, Sangyeob Lee

TL;DR
ReinPool is a reinforcement learning framework that efficiently compresses multi-vector embeddings into compact representations, maintaining high retrieval performance and significantly reducing index size for scalable document retrieval.
Contribution
It introduces a novel RL-based method to dynamically select and pool embeddings, outperforming static methods and enabling scalable multi-vector retrieval.
Findings
ReinPool compresses embeddings by up to 1249x.
ReinPool recovers 76-81% of full multi-vector retrieval performance.
ReinPool improves NDCG@3 by 22-33% over static baselines.
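For readers unfamiliar with the metric behind the last finding, here is a minimal sketch of NDCG@k. It assumes the linear-gain form of DCG and assumes the input list holds the relevance labels of all candidates in ranked order. The function names are illustrative and are not taken from the paper.

```python
import math

def dcg_at_k(relevances, k):
    """Discounted cumulative gain over the first k items of a ranked list."""
    # Rank 0 is discounted by log2(2) = 1, rank 1 by log2(3), and so on.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))

def ndcg_at_k(ranked_relevances, k=3):
    """NDCG@k: DCG of the given ranking divided by DCG of the ideal ranking."""
    # Assumption: the ideal ranking is the same labels sorted in descending order.
    ideal_dcg = dcg_at_k(sorted(ranked_relevances, reverse=True), k)
    if ideal_dcg == 0:
        return 0.0  # no relevant item among the candidates
    return dcg_at_k(ranked_relevances, k) / ideal_dcg
```

A ranking that places the only relevant document first scores 1.0, and the score drops as that document moves down the list.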
Abstract
Multi-vector embedding models have emerged as a powerful paradigm for document retrieval, preserving fine-grained visual and textual details through token-level representations. However, this expressiveness comes at a staggering cost: storing embeddings for every token inflates index sizes many times over compared to single-vector approaches, severely limiting scalability. We introduce ReinPool, a reinforcement learning framework that learns to dynamically filter and pool multi-vector embeddings into compact, retrieval-optimized representations. By training with an inverse retrieval objective and NDCG-based rewards, ReinPool identifies and retains only the most discriminative vectors without requiring manual importance annotations. On the Vidore V2 benchmark across three vision-language embedding models, ReinPool compresses multi-vector representations by…
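The filter-and-pool idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the summary does not specify the policy network, the pooling operator, or the RL algorithm, so the Bernoulli keep mask, the mean pooling, and the name `filter_and_pool` are all assumptions made for illustration.

```python
import random

def filter_and_pool(vectors, keep_probs, rng):
    """Sample a keep/drop mask over token vectors, then mean-pool the kept ones.

    vectors:    list of equal-length float lists (one per token)
    keep_probs: per-token keep probabilities (hypothetical policy output)
    rng:        random.Random instance used to sample the mask
    """
    # Assumption: each token is kept independently with its own probability.
    mask = [rng.random() < p for p in keep_probs]
    if not any(mask):
        # Fallback so pooling is always defined: keep the most probable token.
        best = max(range(len(keep_probs)), key=keep_probs.__getitem__)
        mask[best] = True
    kept = [v for v, keep in zip(vectors, mask) if keep]
    dim = len(vectors[0])
    # Assumption: mean pooling; the paper may use a different operator.
    pooled = [sum(v[i] for v in kept) / len(kept) for i in range(dim)]
    return pooled, mask
```

In an RL setup of this kind, the sampled mask would be the action and a retrieval score such as NDCG computed with the pooled vector would be the reward. Pooling N token vectors of a document into one vector of the same dimension and precision reduces that document's storage by a factor of N.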
Taxonomy
Topics: Multimodal Machine Learning Applications · Topic Modeling · Handwritten Text Recognition Techniques
