Progressively Optimized Bi-Granular Document Representation for Scalable Embedding Based Retrieval
Shitao Xiao, Zheng Liu, Weihao Han, Jianjin Zhang, Yingxia Shao, Defu Lian, Chaozhuo Li, Hao Sun, Denvy Deng, Liangjie Zhang, Qi Zhang, Xing Xie

TL;DR
This paper introduces a bi-granular document representation for scalable embedding-based retrieval: lightweight sparse embeddings are kept in memory for coarse-grained candidate search, heavyweight dense embeddings are stored on disk for fine-grained post verification, and the two are optimized progressively to improve retrieval accuracy when the index is too large to fit in memory.
Contribution
The paper proposes a novel progressive optimization framework for bi-granular embeddings, enabling scalable and accurate retrieval in massive corpora with reduced memory footprint.
Findings
Up to +4.3% recall on a million-scale corpus
Up to +17.5% recall on a billion-scale corpus
Substantial gains in revenue, recall, and CTR on a real search platform
Abstract
Ad-hoc search calls for selecting appropriate answers from a massive-scale corpus. Embedding-based retrieval (EBR) has become a promising solution, in which deep-learning-based document representation and ANN search techniques are combined to handle this task. However, a major challenge is that the ANN index can be too large to fit into memory, given the considerable size of the answer corpus. In this work, we tackle this problem with Bi-Granular Document Representation, where lightweight sparse embeddings are indexed and kept on standby in memory for coarse-grained candidate search, and heavyweight dense embeddings are hosted on disk for fine-grained post verification. To maximize retrieval accuracy, a Progressive Optimization framework is designed. The sparse embeddings are learned first for high-quality candidate search. Conditioned on the candidate distribution…
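The two-tier retrieval flow described in the abstract (a compact in-memory index for candidate search, then full embeddings for post verification) can be illustrated with a minimal sketch. This is not the paper's method: the class name `BiGranularIndex`, the sign-bit codes standing in for the lightweight embeddings, the Hamming-distance candidate search, and the plain dictionary simulating disk storage are all assumptions chosen to keep the example self-contained.

```python
import math
import random


def normalize(vec):
    """Scale a vector to unit length so inner product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def binarize(vec):
    """Hypothetical lightweight code: one sign bit per dimension (a stand-in only)."""
    return tuple(1 if x > 0 else 0 for x in vec)


def hamming(a, b):
    """Number of positions where two bit codes differ."""
    return sum(x != y for x, y in zip(a, b))


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


class BiGranularIndex:
    """Toy two-tier index: compact codes for recall, dense vectors for re-ranking."""

    def __init__(self, dense_vectors):
        # Assumption: this dict simulates the disk tier holding full dense embeddings.
        self.dense_store = dict(enumerate(dense_vectors))
        # Assumption: this dict simulates the in-memory tier of lightweight codes.
        self.light_codes = {i: binarize(v) for i, v in self.dense_store.items()}

    def search(self, query, n_candidates, top_k):
        # Stage 1 (coarse): scan the compact codes and keep the closest candidates.
        q_code = binarize(query)
        candidates = sorted(
            self.light_codes,
            key=lambda i: hamming(q_code, self.light_codes[i]),
        )[:n_candidates]
        # Stage 2 (fine): fetch dense vectors only for candidates and re-rank them.
        reranked = sorted(
            candidates,
            key=lambda i: dot(query, self.dense_store[i]),
            reverse=True,
        )
        return reranked[:top_k]


def demo():
    """Build a small random corpus and query it with one of its own documents."""
    rng = random.Random(0)
    docs = [normalize([rng.gauss(0, 1) for _ in range(16)]) for _ in range(200)]
    index = BiGranularIndex(docs)
    return index.search(docs[42], n_candidates=20, top_k=5)
```

Calling `demo()` returns five document ids, with the queried document itself ranked first because its code matches exactly and its unit-length vector has the highest inner product with itself. The sketch only touches the dense store for the 20 shortlisted candidates, which is the property that lets the heavyweight tier live outside memory.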
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications · Advanced Image and Video Retrieval Techniques
