Mixed Dimension Embeddings with Application to Memory-Efficient Recommendation Systems

Antonio Ginart; Maxim Naumov; Dheevatsa Mudigere; Jiyan Yang; James Zou

arXiv:1909.11810·cs.LG·February 9, 2021·32 cites

TL;DR

This paper introduces mixed dimension embeddings that scale embedding size with query frequency, significantly reducing memory usage in recommendation systems while maintaining or improving accuracy.

Contribution

It proposes a novel mixed dimension embedding architecture with theoretical analysis and empirical validation demonstrating substantial memory savings and performance retention.

Findings

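01  Memory usage (parameter count) reduced by up to 16X while maintaining accuracy
02  Accuracy improved by 0.1% on the Criteo Kaggle dataset using half as many parameters
03  ML performance maintained or improved with fewer parameters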

Abstract

Embedding representations power machine intelligence in many applications, including recommendation systems, but they are space intensive -- potentially occupying hundreds of gigabytes in large-scale settings. To help manage this outsized memory consumption, we explore mixed dimension embeddings, an embedding layer architecture in which a particular embedding vector's dimension scales with its query frequency. Through theoretical analysis and systematic experiments, we demonstrate that using mixed dimensions can drastically reduce memory usage while maintaining, and even improving, ML performance. Empirically, we show that the proposed mixed dimension layers improve accuracy by 0.1% using half as many parameters, or maintain it using 16X fewer parameters, on a click-through rate prediction task on the Criteo Kaggle dataset.
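
To make the idea of a frequency-dependent embedding dimension concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract only says that dimension scales with query frequency, so the block layout, the power-law sizing rule in `mixed_dims`, the exponent `alpha`, and the per-block projection to a shared `base_dim` are all hypothetical choices made for this example.

```python
import random


def mixed_dims(block_freqs, base_dim, alpha):
    """Hypothetical sizing rule: the most frequently queried block gets
    base_dim; other blocks shrink as (freq / top_freq) ** alpha."""
    top = max(block_freqs)
    return [max(1, round(base_dim * (f / top) ** alpha)) for f in block_freqs]


class MixedDimEmbedding:
    """Embedding rows are grouped into blocks (hottest first). Each block
    stores vectors of its own dimension plus, if that dimension is smaller
    than base_dim, a projection matrix up to base_dim (an assumption here)."""

    def __init__(self, block_sizes, block_dims, base_dim, seed=0):
        rng = random.Random(seed)
        self.base_dim = base_dim
        self.offsets, self.tables, self.projs = [], [], []
        start = 0
        for n, d in zip(block_sizes, block_dims):
            self.offsets.append(start)
            start += n
            self.tables.append(
                [[rng.gauss(0.0, 0.1) for _ in range(d)] for _ in range(n)]
            )
            # No projection is needed when the block already has base_dim.
            self.projs.append(
                None
                if d == base_dim
                else [[rng.gauss(0.0, 0.1) for _ in range(base_dim)] for _ in range(d)]
            )
        self.num_rows = start

    def lookup(self, idx):
        """Return a base_dim-long vector for row idx."""
        if not 0 <= idx < self.num_rows:
            raise IndexError(idx)
        # Find the last block whose starting offset is <= idx.
        block = max(b for b, off in enumerate(self.offsets) if off <= idx)
        row = self.tables[block][idx - self.offsets[block]]
        proj = self.projs[block]
        if proj is None:
            return list(row)
        return [
            sum(row[k] * proj[k][j] for k in range(len(row)))
            for j in range(self.base_dim)
        ]

    def num_params(self):
        """Count stored scalars: embedding tables plus projection matrices."""
        total = 0
        for table, proj in zip(self.tables, self.projs):
            total += len(table) * len(table[0])
            if proj is not None:
                total += len(proj) * len(proj[0])
        return total


# Toy setup (made-up numbers): 10 hot rows, 100 warm rows, 1000 cold rows.
block_sizes = [10, 100, 1000]
block_freqs = [0.5, 0.05, 0.005]  # hypothetical per-row query frequencies
dims = mixed_dims(block_freqs, base_dim=16, alpha=0.5)
emb = MixedDimEmbedding(block_sizes, dims, base_dim=16)
uniform_params = sum(block_sizes) * 16
print(dims, emb.num_params(), uniform_params)
```

In this toy setup the rarely queried rows dominate the table, so shrinking their dimension removes most of the parameters even after the projection matrices are counted. The numbers are illustrative only and are unrelated to the Criteo results reported above.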

Taxonomy

Topics: Recommender Systems and Techniques · Caching and Content Delivery · Advanced Graph Neural Networks