Hierarchical Patch Compression for ColPali: Efficient Multi-Vector Document Retrieval with Dynamic Pruning and Quantization
Duong Bach

TL;DR
HPC-ColPali introduces a hierarchical patch compression framework that significantly reduces storage and computation costs in multi-vector document retrieval systems while maintaining high accuracy, enabling faster and more scalable retrieval.
Contribution
The paper presents a novel hierarchical patch compression method combining quantization, dynamic pruning, and binary encoding to improve efficiency in multi-vector retrieval systems.
Findings
Achieves 30-50% lower query latency while maintaining retrieval precision.
Reduces storage by up to 32x through K-Means quantization.
Decreases late-interaction computation by up to 60% with minimal accuracy loss.
Abstract
Multi-vector document retrieval systems, such as ColPali, excel in fine-grained matching for complex queries but incur significant storage and computational costs due to their reliance on high-dimensional patch embeddings and late-interaction scoring. To address these challenges, we propose HPC-ColPali, a Hierarchical Patch Compression framework that enhances the efficiency of ColPali while preserving its retrieval accuracy. Our approach integrates three innovative techniques: (1) K-Means quantization, which compresses patch embeddings into 1-byte centroid indices, achieving up to 32x storage reduction; (2) attention-guided dynamic pruning, utilizing Vision-Language Model attention weights to retain only the top-p% most salient patches, reducing late-interaction computation by up to 60% with less than 2% nDCG@10 loss; and (3) optional binary encoding of centroid indices into…
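The first two techniques named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes NumPy, a codebook of 256 centroids (so that each index fits in one byte), a plain Lloyd's K-Means loop, randomly generated stand-ins for patch embeddings and attention weights, and pruning applied to document patches. The function names `kmeans_codebook`, `assign`, `prune_top_p`, and `maxsim_score`, and the keep fraction `p=0.4`, are all hypothetical.

```python
import numpy as np

def assign(patches, centroids):
    """Map each patch embedding to the index of its nearest centroid (1 byte each)."""
    # Squared Euclidean distance expanded as ||a||^2 - 2ab + ||b||^2.
    d2 = ((patches ** 2).sum(axis=1, keepdims=True)
          - 2.0 * patches @ centroids.T
          + (centroids ** 2).sum(axis=1))
    return d2.argmin(axis=1).astype(np.uint8)  # valid only while k <= 256

def kmeans_codebook(patches, k=256, iters=10, seed=0):
    """Plain Lloyd's K-Means (an assumption; any K-Means variant would do)."""
    rng = np.random.default_rng(seed)
    centroids = patches[rng.choice(len(patches), size=k, replace=False)].copy()
    for _ in range(iters):
        codes = assign(patches, centroids)
        for j in range(k):
            members = patches[codes == j]
            if len(members):  # leave empty clusters where they are
                centroids[j] = members.mean(axis=0)
    return centroids

def prune_top_p(codes, attention, p=0.4):
    """Keep the fraction p of patches with the highest attention weight."""
    keep = max(1, int(np.ceil(p * len(codes))))
    top = np.argsort(attention)[::-1][:keep]
    return codes[top]

def maxsim_score(query, codes, centroids):
    """Late interaction: for each query vector, take the best match among the
    decoded (centroid-reconstructed) patches, then sum over query vectors."""
    doc = centroids[codes]
    return float((query @ doc.T).max(axis=1).sum())

# Synthetic stand-ins: 1024 patches and 8 query vectors of dimension 128.
rng = np.random.default_rng(0)
patches = rng.normal(size=(1024, 128)).astype(np.float32)
query = rng.normal(size=(8, 128)).astype(np.float32)
attention = rng.random(len(patches))  # placeholder for VLM attention weights

centroids = kmeans_codebook(patches, k=256, iters=5)
codes = assign(patches, centroids)             # one uint8 per patch
kept = prune_top_p(codes, attention, p=0.4)    # fewer patches to score
full_score = maxsim_score(query, codes, centroids)
pruned_score = maxsim_score(query, kept, centroids)
```

In this sketch the stored document shrinks to one `uint8` per patch plus a shared codebook, and pruning cuts the number of query-patch similarity computations in proportion to the patches dropped. The pruned score can never exceed the full score, because each per-query maximum is taken over a subset of the original patches.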
Taxonomy
Topics: Algorithms and Data Compression · Information Retrieval and Search Behavior · Advanced Image and Video Retrieval Techniques
