TL;DR
HET is a scalable distributed framework for training large embedding models. It uses an embedding cache and a new consistency model to reduce communication overhead and speed up training.
Contribution
HET introduces a cache-enabled distributed system with a novel consistency model that improves the scalability of training huge embedding models.
Findings
Achieves up to 88% reduction in embedding communication.
Realizes up to 20.68x speedup over baselines.
Effectively handles skewed embedding popularity distributions.
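To illustrate why skewed embedding popularity helps a cache, the following minimal sketch compares the hit rate of a small LRU cache under a Zipf-like access pattern and under a uniform one. It is not HET's implementation: the LRU policy, the Zipf exponent, the table size, and the cache size are all assumptions chosen for illustration, and every hit simply stands for one server round trip avoided.

```python
import random
from collections import OrderedDict
from itertools import accumulate


def lru_hit_rate(accesses, capacity):
    """Replay a sequence of embedding ids through an LRU cache and return the hit rate."""
    cache = OrderedDict()
    hits = 0
    for key in accesses:
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used id
    return hits / len(accesses)


def sample_ids(num_ids, num_accesses, exponent, rng):
    """Draw ids with probability proportional to 1 / rank**exponent (0.0 gives uniform)."""
    weights = [1.0 / (rank ** exponent) for rank in range(1, num_ids + 1)]
    cum_weights = list(accumulate(weights))
    return rng.choices(range(num_ids), cum_weights=cum_weights, k=num_accesses)


rng = random.Random(0)  # fixed seed so the comparison is repeatable
NUM_IDS, NUM_ACCESSES, CAPACITY = 10_000, 50_000, 1_000  # assumed sizes (cache holds 10% of ids)

skewed_rate = lru_hit_rate(sample_ids(NUM_IDS, NUM_ACCESSES, 1.0, rng), CAPACITY)
uniform_rate = lru_hit_rate(sample_ids(NUM_IDS, NUM_ACCESSES, 0.0, rng), CAPACITY)

print(f"skewed accesses:  hit rate = {skewed_rate:.2f}")
print(f"uniform accesses: hit rate = {uniform_rate:.2f}")
```

With the same cache size, the skewed workload hits far more often than the uniform one, which is the effect a cache-based design relies on.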
Abstract
Embedding models have been an effective learning paradigm for high-dimensional data. However, one open issue of embedding models is that their representations (latent factors) often result in a large parameter space. We observe that existing distributed training frameworks face a scalability issue with embedding models, since updating and retrieving the shared embedding parameters from servers usually dominates the training cycle. In this paper, we propose HET, a new system framework that significantly improves the scalability of huge embedding model training. We embrace skewed popularity distributions of embeddings as a performance opportunity and leverage them to address the communication bottleneck with an embedding cache. To ensure consistency across the caches, we incorporate a new consistency model into HET's design, which provides fine-grained consistency guarantees on a per-embedding…
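The idea of a worker-side embedding cache with a per-embedding consistency bound can be sketched as follows. This is a generic bounded-staleness sketch under stated assumptions, not HET's actual protocol, which the abstract above does not spell out. The class names `ParameterServer` and `StalenessBoundedCache`, the `max_local_updates` threshold, and the plain SGD step are all hypothetical.

```python
from collections import OrderedDict


class ParameterServer:
    """Toy in-memory stand-in for a remote embedding server (hypothetical)."""

    def __init__(self, dim):
        self.dim = dim
        self.table = {}
        self.pulls = 0   # counts simulated network reads
        self.pushes = 0  # counts simulated network writes

    def pull(self, key):
        self.pulls += 1
        return list(self.table.setdefault(key, [0.0] * self.dim))

    def push(self, key, delta):
        self.pushes += 1
        vec = self.table.setdefault(key, [0.0] * self.dim)
        for i, d in enumerate(delta):
            vec[i] += d


class StalenessBoundedCache:
    """LRU cache that lets each embedding absorb a bounded number of local updates
    before it is synchronized with the server (an assumed policy, for illustration)."""

    def __init__(self, server, capacity, max_local_updates):
        self.server = server
        self.capacity = capacity
        self.max_local_updates = max_local_updates
        self.entries = OrderedDict()  # key -> [vector, pending_delta, local_update_count]

    def lookup(self, key):
        entry = self.entries.get(key)
        if entry is None:
            vec = self.server.pull(key)  # cache miss: one round trip
            entry = [vec, [0.0] * len(vec), 0]
            self.entries[key] = entry
            if len(self.entries) > self.capacity:
                old_key, old = self.entries.popitem(last=False)
                if old[2] > 0:  # write back unsynchronized updates on eviction
                    self.server.push(old_key, old[1])
        else:
            self.entries.move_to_end(key)
        return entry[0]

    def update(self, key, grad, lr=0.1):
        self.lookup(key)  # make sure the embedding is cached
        entry = self.entries[key]
        for i, g in enumerate(grad):
            step = -lr * g
            entry[0][i] += step  # apply locally
            entry[1][i] += step  # remember what the server has not seen yet
        entry[2] += 1
        if entry[2] >= self.max_local_updates:  # per-embedding staleness bound reached
            self.server.push(key, entry[1])
            entry[0] = self.server.pull(key)
            entry[1] = [0.0] * len(entry[0])
            entry[2] = 0


server = ParameterServer(dim=2)
cache = StalenessBoundedCache(server, capacity=2, max_local_updates=3)
for _ in range(6):
    cache.update("hot", [1.0, 1.0])

print(server.pulls, server.pushes)
```

In this toy run, six updates to one popular embedding cost three pulls and two pushes instead of six of each, because the cache only contacts the server on a miss or when the embedding's local update count reaches the bound. Tightening `max_local_updates` trades communication for fresher parameters.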
