Hybrid-Vector Retrieval for Visually Rich Documents: Combining Single-Vector Efficiency and Multi-Vector Accuracy
Juyeon Kim, Geon Lee, Dongwon Choi, Taeuk Kim, Kijung Shin

TL;DR
HEAVEN is a hybrid-vector retrieval framework that combines the efficiency of single-vector methods with the accuracy of multi-vector approaches for visually rich document retrieval.
Contribution
The paper introduces HEAVEN, a novel two-stage hybrid-vector framework that significantly improves retrieval efficiency while maintaining high accuracy for visually rich documents.
Findings
HEAVEN achieves 99.87% of the Recall@1 of multi-vector retrieval.
HEAVEN reduces per-query computation by 99.82%.
The paper introduces ViMDoc, a benchmark for evaluating document retrieval under realistic conditions.
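To put the computation figure in perspective, a percentage reduction can be converted into a "times fewer" factor. The helper below is purely illustrative arithmetic; the function name is hypothetical and nothing here comes from the paper's code.

```python
def reduction_to_factor(reduction_pct: float) -> float:
    """Convert a percentage reduction into a 'times fewer' factor.

    A 99.82% reduction leaves 0.18% of the original cost,
    so the factor is 1 / 0.0018.
    """
    remaining = 1.0 - reduction_pct / 100.0
    return 1.0 / remaining


factor = reduction_to_factor(99.82)  # roughly 556
```

In other words, a 99.82% reduction corresponds to about 556 times less per-query computation than the baseline against which the reduction is measured.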
Abstract
Retrieval over visually rich documents is essential for tasks such as legal discovery, scientific search, and enterprise knowledge management. Existing approaches fall into two paradigms: single-vector retrieval, which is efficient but coarse, and multi-vector retrieval, which is accurate but computationally expensive. To address this trade-off, we propose HEAVEN, a plug-and-play two-stage hybrid-vector framework. In the first stage, HEAVEN efficiently retrieves candidate pages using a single-vector method over Visually-Summarized Pages (VS-Pages), which assemble representative visual layouts from multiple pages. In the second stage, it reranks candidates with a multi-vector method while filtering query tokens by linguistic importance to reduce redundant computations. To evaluate retrieval systems under realistic conditions, we also introduce ViMDoc, a benchmark for visually rich,…
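The two-stage flow described in the abstract can be sketched as follows. This is a minimal illustration under several assumptions, not the authors' implementation: every function and variable name is hypothetical, each VS-Page is assumed to map to a list of member page ids, the "linguistic importance" filter is replaced by a simple stopword list, and the multi-vector score is assumed to be the late-interaction MaxSim sum commonly used by multi-vector retrievers.

```python
import numpy as np

# Hypothetical stand-in for filtering query tokens by linguistic importance.
STOPWORDS = {"the", "a", "an", "of", "in", "is", "what", "for", "to"}


def stage1_candidates(query_vec, vs_page_vecs, vs_page_members, top_k):
    """Stage 1: score VS-Pages with one vector each and return the
    page ids belonging to the top_k VS-Pages (assumed mapping)."""
    scores = vs_page_vecs @ query_vec          # one dot product per VS-Page
    top = np.argsort(-scores)[:top_k]          # indices of the best VS-Pages
    candidates = []
    for i in top:
        for page_id in vs_page_members[i]:
            if page_id not in candidates:
                candidates.append(page_id)
    return candidates


def filter_query_tokens(tokens, token_vecs):
    """Keep only the embeddings of tokens that are not stopwords.
    Falls back to all tokens if the filter would remove everything."""
    keep = [i for i, t in enumerate(tokens) if t.lower() not in STOPWORDS]
    if not keep:
        keep = list(range(len(tokens)))
    return token_vecs[keep]


def maxsim(query_token_vecs, page_token_vecs):
    """Late-interaction score: for each query token take the best-matching
    page token, then sum over query tokens."""
    sim = query_token_vecs @ page_token_vecs.T
    return float(sim.max(axis=1).sum())


def two_stage_retrieve(query_vec, tokens, token_vecs,
                       vs_page_vecs, vs_page_members,
                       page_token_vecs, top_k=1):
    """Stage 1 narrows the pool cheaply; Stage 2 reranks only the
    candidates, using only the retained query tokens."""
    candidates = stage1_candidates(query_vec, vs_page_vecs,
                                   vs_page_members, top_k)
    kept = filter_query_tokens(tokens, token_vecs)
    return sorted(candidates,
                  key=lambda p: maxsim(kept, page_token_vecs[p]),
                  reverse=True)
```

In this sketch the savings come from two places: the multi-vector score is computed only for the pages that survive Stage 1, and each of those computations uses fewer query tokens.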
