Model Size Reduction Using Frequency Based Double Hashing for Recommender Systems
Caojin Zhang, Yicun Liu, Yuanpu Xie, Sofia Ira Ktena, Alykhan Tejani, Akshay Gupta, Pranay Kumar Myana, Deepak Dilipkumar, Suvadip Paul, Ikuhiro Ihara, Prasang Upadhyaya, Ferenc Huszar, Wenzhe Shi

TL;DR
This paper introduces a hybrid hashing technique that combines frequency hashing with double hashing, reducing the size of recommender system models by around 90% while keeping performance on par with the original baselines.
Contribution
It presents a novel hybrid hashing method that effectively compresses large DNN models in recommender systems while maintaining their accuracy.
Findings
Model size reduced by approximately 90%.
Performance remains comparable to baseline models.
Evaluated on two product surfaces, with comparable results on both.
Abstract
Deep Neural Networks (DNNs) with sparse input features have been widely used in recommender systems in industry. These models have large memory requirements and need a huge amount of training data. The large model size usually entails a cost, in the range of millions of dollars, for storage and communication with the inference services. In this paper, we propose a hybrid hashing method that combines frequency hashing and double hashing techniques for model size reduction, without compromising performance. We evaluate the proposed models on two product surfaces. In both cases, experimental results demonstrate that we can reduce the model size by around 90% while keeping the performance on par with the original baselines.
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Video Analysis and Summarization
