Optimized Cartesian $K$-Means
Jianfeng Wang, Jingdong Wang, Jingkuan Song, Xin-Shun Xu, Heng Tao Shen, Shipeng Li

TL;DR
The paper introduces Optimized Cartesian K-Means (OCKM), a novel encoding method that uses multiple sub codewords from different sub codebooks to improve the accuracy of approximate nearest neighbor search in high-dimensional data.
Contribution
OCKM extends traditional product quantization by encoding each subvector with multiple sub codewords from optimally generated sub codebooks, reducing distortion and enhancing search accuracy.
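To illustrate why encoding a subvector with several sub codewords can lower distortion, here is a minimal sketch. It rests on assumptions that this summary does not confirm: the subvector is reconstructed as the sum of one sub codeword from each sub codebook, and the best combination is found by exhaustive search. The function name `encode_multi` and the toy codebooks are hypothetical, and the paper's actual optimization procedure may differ.

```python
from itertools import product


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def encode_multi(subvec, sub_codebooks):
    """Pick one sub codeword per sub codebook so that their sum
    (an assumed combination rule) best approximates `subvec`.

    Exhaustive search: only practical for tiny toy codebooks.
    Returns (tuple of chosen indices, squared reconstruction error).
    """
    best_code, best_err = None, float("inf")
    index_ranges = (range(len(cb)) for cb in sub_codebooks)
    for code in product(*index_ranges):
        # Reconstruct by summing the selected sub codewords dimension-wise.
        recon = [
            sum(cb[k][d] for cb, k in zip(sub_codebooks, code))
            for d in range(len(subvec))
        ]
        err = sq_dist(subvec, recon)
        if err < best_err:
            best_code, best_err = code, err
    return best_code, best_err


# Toy data (hypothetical): one 2-D subvector, two sub codebooks of two codewords each.
subvec = [1.0, 2.0]
sub_codebooks = [
    [[0.0, 0.0], [1.0, 0.0]],
    [[0.0, 0.0], [0.0, 2.0]],
]

single = encode_multi(subvec, sub_codebooks[:1])  # one sub codeword only
multi = encode_multi(subvec, sub_codebooks)       # two sub codewords
```

In this toy case the single-codeword encoding leaves a squared error of 4.0, while combining two sub codewords reconstructs the subvector exactly. The cost is a longer code, since one index is stored per sub codebook.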
Findings
Outperforms state-of-the-art methods on real datasets
Provides more flexible and lower-distortion encoding
Improves approximate nearest neighbor search accuracy
Abstract
Product quantization-based approaches are effective for encoding high-dimensional data points for approximate nearest neighbor search. The space is decomposed into a Cartesian product of low-dimensional subspaces, each of which generates a sub codebook. Data points are encoded as compact binary codes using these sub codebooks, and the distance between two data points can be approximated efficiently from their codes using precomputed lookup tables. Traditionally, to encode a subvector of a data point in a subspace, only one sub codeword in the corresponding sub codebook is selected, which may impose strict restrictions on the search accuracy. In this paper, we propose a novel approach, named Optimized Cartesian $K$-Means (OCKM), to better encode the data points for more accurate approximate nearest neighbor search. In OCKM, multiple sub codewords are used to encode the subvector of a data…
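The traditional product quantization pipeline described in the abstract (one sub codeword per subspace, distances read from lookup tables) can be sketched as follows. This is a minimal illustration of the baseline scheme, not of OCKM itself. It assumes contiguous, equal-sized subspaces and already-trained sub codebooks; all function names and the toy data are hypothetical.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def split(vec, num_subspaces):
    """Split a vector into equal-length contiguous subvectors."""
    sub_dim = len(vec) // num_subspaces
    return [vec[i * sub_dim:(i + 1) * sub_dim] for i in range(num_subspaces)]


def encode(vec, codebooks):
    """Encode a vector as one sub codeword index per subspace."""
    subs = split(vec, len(codebooks))
    return [
        min(range(len(cb)), key=lambda k: sq_dist(sub, cb[k]))
        for sub, cb in zip(subs, codebooks)
    ]


def build_tables(query, codebooks):
    """Precompute, per subspace, the squared distance from the query
    subvector to every sub codeword."""
    subs = split(query, len(codebooks))
    return [[sq_dist(sub, cw) for cw in cb] for sub, cb in zip(subs, codebooks)]


def approx_dist(code, tables):
    """Approximate the query-to-point distance with one table lookup per subspace."""
    return sum(table[k] for k, table in zip(code, tables))


# Toy setup (hypothetical): 4-D vectors, 2 subspaces, 2 sub codewords each.
codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 1.0], [1.0, 0.0]],
]
point = [0.9, 1.2, 0.1, 0.8]
query = [1.0, 1.0, 0.0, 0.0]

code = encode(point, codebooks)          # compact code for the database point
tables = build_tables(query, codebooks)  # built once per query
estimate = approx_dist(code, tables)     # no access to the original point needed
```

The tables are built once per query and reused for every database point, so each distance estimate costs only one lookup and one addition per subspace.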
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Remote-Sensing Image Classification · Robotics and Sensor-Based Localization
