Coding for Random Projections
Ping Li, Michael Mitzenmacher, Anshumali Shrivastava

TL;DR
This paper explores coding schemes for random projections, showing that simple uniform quantization often outperforms existing methods and that low-bit coding can be effective for similarity estimation and training linear classifiers.
Contribution
It introduces and evaluates simple uniform and non-uniform 2-bit coding schemes for random projections, improving efficiency in similarity estimation and classifier training.
Findings
Uniform quantization outperforms the standard existing method of Datar et al. (2004).
Low-bit coding schemes are often sufficient for effective performance.
Non-uniform 2-bit coding performs well in practice, especially for training linear SVMs.
Abstract
The method of random projections has become very popular for large-scale applications in statistical learning, information retrieval, bio-informatics and other applications. Using a well-designed coding scheme for the projected data, which determines the number of bits needed for each projected value and how to allocate these bits, can significantly improve the effectiveness of the algorithm, in storage cost as well as computational speed. In this paper, we study a number of simple coding schemes, focusing on the task of similarity estimation and on an application to training linear classifiers. We demonstrate that uniform quantization outperforms the standard existing influential method (Datar et al., 2004). Indeed, we argue that in many cases coding with just a small number of bits suffices. Furthermore, we also develop a non-uniform 2-bit coding scheme that generally performs well in…
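To make the coding schemes named in the abstract concrete, the sketch below projects vectors with a Gaussian random matrix, codes each projected value, and uses the fraction of matching codes as a rough similarity signal. It rests on several assumptions not stated in this summary: uniform quantization is taken to be floor(x / w) for a bin width w, the Datar et al. style code is taken to be floor((x + q) / w) with a random offset q drawn uniformly from [0, w), and the non-uniform 2-bit code is taken to split the real line at -w, 0, and w. All function names, the choice w = 1.0, and the example vectors are hypothetical and chosen only for illustration.

```python
import math
import random


def gaussian_matrix(k, dim, seed):
    """k x dim projection matrix with i.i.d. N(0, 1) entries."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(k)]


def project(matrix, vec):
    """Project vec onto each row of the matrix."""
    return [sum(r * v for r, v in zip(row, vec)) for row in matrix]


def uniform_code(values, w):
    """Assumed uniform quantization: bin index floor(x / w)."""
    return [math.floor(x / w) for x in values]


def offset_code(values, w, offsets):
    """Assumed random-offset scheme: floor((x + q) / w), one q per projection."""
    return [math.floor((x + q) / w) for x, q in zip(values, offsets)]


def two_bit_code(values, w):
    """Assumed 2-bit code over (-inf, -w), [-w, 0), [0, w), [w, inf)."""
    codes = []
    for x in values:
        if x < -w:
            codes.append(0)
        elif x < 0:
            codes.append(1)
        elif x < w:
            codes.append(2)
        else:
            codes.append(3)
    return codes


def collision_rate(a, b):
    """Fraction of positions where two code sequences agree."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def collision_rates(u, v, k=400, w=1.0, seed=0):
    """Collision rate of each coding scheme for one pair of vectors."""
    matrix = gaussian_matrix(k, len(u), seed)
    pu, pv = project(matrix, u), project(matrix, v)
    rng = random.Random(seed + 1)
    # The same offset is shared by both vectors for a given projection.
    offsets = [rng.uniform(0.0, w) for _ in range(k)]
    return {
        "uniform": collision_rate(uniform_code(pu, w), uniform_code(pv, w)),
        "offset": collision_rate(offset_code(pu, w, offsets),
                                 offset_code(pv, w, offsets)),
        "two_bit": collision_rate(two_bit_code(pu, w), two_bit_code(pv, w)),
    }


if __name__ == "__main__":
    x = [1.0, 0.0, 0.0]
    close = [0.995, math.sqrt(1 - 0.995 ** 2), 0.0]  # cosine 0.995 with x
    far = [0.0, 1.0, 0.0]                            # cosine 0 with x
    print("close pair:", collision_rates(x, close))
    print("far pair:  ", collision_rates(x, far))
```

For unit-norm inputs each projected value is standard normal, so the collision rate of every scheme should rise with the cosine similarity of the pair. Turning a collision rate into an actual similarity estimate requires inverting the scheme's collision probability, which this sketch does not attempt.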
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Algorithms and Data Compression · Machine Learning and Algorithms
