The Future is Sparse: Embedding Compression for Scalable Retrieval in Recommender Systems
Petr Kasalick\'y, Martin Spi\v{s}\'ak, Vojt\v{e}ch Van\v{c}ura, Daniel Bohun\v{e}k, Rodrigo Alves, Pavel Kord\'ik

TL;DR
This paper introduces a learnable embedding compression method that projects dense embeddings into a sparse, high-dimensional space, significantly reducing memory usage while maintaining retrieval accuracy for large-scale recommender systems.
Contribution
It proposes a novel, lightweight, learnable compression technique that leverages sparsity to enable scalable, resource-efficient retrieval in recommender systems.
Findings
Memory usage is significantly reduced.
Retrieval performance is preserved despite compression.
Sparsity improves scalability of large-scale recommenders.
Abstract
Industry-scale recommender systems face a core challenge: representing entities with high cardinality, such as users or items, using dense embeddings that must be accessible during both training and inference. However, as embedding sizes grow, memory constraints make storage and access increasingly difficult. We describe a lightweight, learnable embedding compression technique that projects dense embeddings into a high-dimensional, sparsely activated space. Designed for retrieval tasks, our method reduces memory requirements while preserving retrieval performance, enabling scalable deployment under strict resource constraints. Our results demonstrate that leveraging sparsity is a promising approach for improving the efficiency of large-scale recommenders. We release our code at https://github.com/recombee/CompresSAE.
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Code & Models
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsRecommender Systems and Techniques · Information Retrieval and Search Behavior · Advanced Graph Neural Networks
