Binary Embedding-based Retrieval at Tencent
Yukang Gan, Yixiao Ge, Chang Zhou, Shupeng Su, Zhouchuan Xu, Xuyuan Xu, Quanchao Hui, Xiang Chen, Yexin Wang, Ying Shan

TL;DR
This paper introduces a binary embedding retrieval system that compresses embeddings to save costs and improve efficiency, supporting large-scale industrial search applications with minimal accuracy loss.
Contribution
It proposes a novel recurrent binarization algorithm with task-agnostic training and a symmetric distance calculation for efficient large-scale retrieval.
Findings
Achieves 30-50% index cost savings with minimal accuracy loss.
Supports multi-version indexing within a unified system.
Demonstrates effectiveness through extensive offline and online experiments.
Abstract
Large-scale embedding-based retrieval (EBR) is the cornerstone of search-related industrial applications. Given a user query, an EBR system aims to identify relevant information from a large corpus of documents that may number in the tens or hundreds of billions. Storage and computation become expensive and inefficient with massive document collections and highly concurrent queries, making it difficult to scale further. To tackle this challenge, we propose a binary embedding-based retrieval (BEBR) engine equipped with a recurrent binarization algorithm that enables customized bits per dimension. Specifically, we compress the full-precision query and document embeddings, generally formulated as float vectors, into a composition of multiple binary vectors using a lightweight transformation model with residual multilayer perceptron (MLP) blocks. We can therefore tailor the number of…
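To make the idea of "a composition of multiple binary vectors" concrete, the following is a minimal sketch of greedy residual binarization in plain Python. It is not the paper's method: the abstract describes a learned transformation model with residual MLP blocks, whereas this sketch swaps in a fixed, training-free rule (take the sign of the current residual and scale it by the residual's mean absolute value). The function names `residual_binarize`, `reconstruct`, and `squared_error`, and the `levels` parameter, are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple

# One level of the code: a float scale plus a vector of +1/-1 entries.
Level = Tuple[float, List[int]]


def residual_binarize(vec: List[float], levels: int) -> List[Level]:
    """Approximate `vec` as a sum of `levels` scaled +1/-1 vectors.

    ASSUMPTION: a greedy, training-free stand-in for a learned binarizer.
    Each level binarizes whatever the previous levels failed to capture.
    """
    residual = list(vec)
    codes: List[Level] = []
    for _ in range(levels):
        # Mean absolute value is the least-squares optimal scale for sign(residual).
        scale = sum(abs(r) for r in residual) / len(residual)
        bits = [1 if r >= 0 else -1 for r in residual]
        codes.append((scale, bits))
        # Subtract this level's contribution; the next level encodes the remainder.
        residual = [r - scale * b for r, b in zip(residual, bits)]
    return codes


def reconstruct(codes: List[Level], dim: int) -> List[float]:
    """Sum the scaled binary vectors back into a float approximation."""
    out = [0.0] * dim
    for scale, bits in codes:
        for i, b in enumerate(bits):
            out[i] += scale * b
    return out


def squared_error(a: List[float], b: List[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


if __name__ == "__main__":
    embedding = [0.9, -0.4, 0.1, -0.7]
    for levels in (1, 2, 3):
        approx = reconstruct(residual_binarize(embedding, levels), len(embedding))
        print(levels, squared_error(embedding, approx))
```

In this sketch, each extra level spends one more bit per dimension and never increases the reconstruction error, which illustrates the storage-versus-accuracy trade-off that a configurable number of bits per dimension allows.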
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Text and Document Classification Technologies · Domain Adaptation and Few-Shot Learning
