Dimension vs. Precision: A Comparative Analysis of Autoencoders and Quantization for Efficient Vector Retrieval on BEIR SciFact
Satyanarayan Pati (Involead Services Pvt Ltd, Delhi, India)

TL;DR
This paper empirically compares autoencoders and quantization techniques for compressing high-dimensional vectors in dense retrieval, finding that int8 quantization offers an optimal balance of compression and performance.
Contribution
It provides a systematic evaluation of autoencoders versus quantization for vector compression in retrieval, highlighting the effectiveness of int8 quantization for practical deployment.
Findings
Int8 quantization achieves 4x compression with minimal performance loss.
Autoencoders degrade more at similar compression ratios.
Binary quantization causes catastrophic drops in retrieval performance.
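The compression ratios behind these findings follow from the storage size of each element type. The sketch below is an illustration, not the paper's implementation: it assumes NumPy, random stand-in embeddings of dimension 384, a per-dimension symmetric scale for int8, and sign-based binarization. The function names `quantize_int8`, `dequantize_int8`, and `quantize_binary` are hypothetical.

```python
import numpy as np

def quantize_int8(vectors):
    """Symmetric scalar quantization of float32 vectors to int8.

    Assumption: one scale per dimension, taken from the largest
    absolute value seen in that dimension across the corpus.
    """
    scale = np.abs(vectors).max(axis=0) / 127.0
    scale = np.where(scale == 0, 1.0, scale).astype(np.float32)  # avoid divide-by-zero
    codes = np.clip(np.round(vectors / scale), -127, 127).astype(np.int8)
    return codes, scale

def dequantize_int8(codes, scale):
    """Map int8 codes back to approximate float32 values."""
    return codes.astype(np.float32) * scale

def quantize_binary(vectors):
    """Keep only the sign of each dimension, packed 8 dimensions per byte."""
    return np.packbits(vectors > 0, axis=1)

# Random stand-in data, not SciFact embeddings.
rng = np.random.default_rng(0)
embeddings = rng.standard_normal((1000, 384)).astype(np.float32)

codes, scale = quantize_int8(embeddings)
bits = quantize_binary(embeddings)

int8_ratio = embeddings.nbytes / codes.nbytes    # 4 bytes -> 1 byte per dimension
binary_ratio = embeddings.nbytes / bits.nbytes   # 32 bits -> 1 bit per dimension
max_error = np.abs(dequantize_int8(codes, scale) - embeddings).max()

print(f"int8: {int8_ratio:.0f}x, binary: {binary_ratio:.0f}x, "
      f"max int8 reconstruction error: {max_error:.4f}")
```

In this scheme int8 keeps 255 levels per dimension, so the reconstruction error is bounded by half a quantization step, while binary codes keep only the sign. That is consistent with the much larger loss reported for binary quantization.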
Abstract
Dense retrieval models have become a standard for state-of-the-art information retrieval. However, their high-dimensional, high-precision (float32) vector embeddings create significant storage and memory challenges for real-world deployment. To address this, we conduct a rigorous empirical study on the BEIR SciFact benchmark, evaluating the trade-offs between two primary compression strategies: (1) Dimensionality Reduction via deep Autoencoders (AE), reducing original 384-dim vectors to latent spaces from 384 down to 12, and (2) Precision Reduction via Quantization (float16, int8, and binary). We systematically compare each method by measuring the "performance loss" (or gain) relative to a float32 baseline across a full suite of retrieval metrics (NDCG, MAP, MRR, Recall, Precision) at various k cutoffs. Our results show that int8 scalar quantization provides the most effective "sweet…
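To make the "performance loss relative to a float32 baseline" measurement concrete, here is a minimal sketch for one metric, NDCG@k. It assumes binary relevance judgments and defines loss as a percentage of the baseline score; the paper's exact formula is not given in this summary. The helper names `ndcg_at_k` and `relative_loss` and the toy document IDs are hypothetical.

```python
import math

def ndcg_at_k(ranked_ids, relevant_ids, k):
    """NDCG@k for a single query, assuming binary relevance."""
    dcg = sum(
        1.0 / math.log2(rank + 2)  # rank is 0-based, so position 1 gets log2(2)
        for rank, doc_id in enumerate(ranked_ids[:k])
        if doc_id in relevant_ids
    )
    ideal_hits = min(len(relevant_ids), k)
    idcg = sum(1.0 / math.log2(rank + 2) for rank in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

def relative_loss(baseline_score, compressed_score):
    """Percentage loss versus the baseline; a negative value is a gain."""
    return 100.0 * (baseline_score - compressed_score) / baseline_score

# Toy query: two relevant documents, ranked by two hypothetical indexes.
relevant = {"d1", "d3"}
baseline_ranking = ["d1", "d3", "d2"]    # e.g. from float32 vectors
compressed_ranking = ["d2", "d1", "d3"]  # e.g. from compressed vectors

baseline_ndcg = ndcg_at_k(baseline_ranking, relevant, k=3)
compressed_ndcg = ndcg_at_k(compressed_ranking, relevant, k=3)
loss_pct = relative_loss(baseline_ndcg, compressed_ndcg)

print(f"baseline NDCG@3={baseline_ndcg:.4f}, "
      f"compressed NDCG@3={compressed_ndcg:.4f}, loss={loss_pct:.2f}%")
```

In a full evaluation the per-query scores would be averaged over all queries before computing the loss, and the same comparison would be repeated for MAP, MRR, Recall, and Precision at each k cutoff.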
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Information Retrieval and Search Behavior · Advanced Neural Network Applications
