ScaleFreeCTR: MixCache-based Distributed Training System for CTR Models with Huge Embedding Table
Huifeng Guo, Wei Guo, Yong Gao, Ruiming Tang, Xiuqiang He, Wenzhi Liu

TL;DR
ScaleFreeCTR is a distributed training system for large-scale CTR models. It reduces GPU-CPU data transfer latency and improves training efficiency by synchronizing embeddings on GPUs and caching them with MixCache.
Contribution
The paper proposes ScaleFreeCTR, a novel MixCache-based system that shifts embedding synchronization to GPUs, reducing latency and overcoming CPU bottlenecks in training huge CTR models.
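To illustrate why a cache in front of a slower embedding store reduces transfers, here is a minimal sketch of a generic least-recently-used (LRU) cache. It is not the paper's MixCache design, whose details are not given in this summary. The class name `HotEmbeddingCache`, the `slow_fetches` counter, and the toy sizes are all hypothetical.

```python
from collections import OrderedDict
from typing import Dict, List


class HotEmbeddingCache:
    """Generic LRU cache in front of a slower embedding store (illustrative only)."""

    def __init__(self, backing_store: Dict[int, List[float]], capacity: int) -> None:
        self.backing_store = backing_store  # stands in for host (CPU) memory
        self.capacity = capacity            # number of rows the fast tier can hold
        self.cache: "OrderedDict[int, List[float]]" = OrderedDict()
        self.slow_fetches = 0               # counts simulated slow-tier reads

    def get(self, feature_id: int) -> List[float]:
        if feature_id in self.cache:
            self.cache.move_to_end(feature_id)  # mark as most recently used
            return self.cache[feature_id]
        # Cache miss: read the row from the slow tier and keep it in the cache.
        self.slow_fetches += 1
        row = self.backing_store[feature_id]
        self.cache[feature_id] = row
        if len(self.cache) > self.capacity:
            self.cache.popitem(last=False)      # evict the least recently used row
        return row


# Toy store: 100 feature IDs, each with a 4-dimensional vector (assumed sizes).
store = {i: [float(i)] * 4 for i in range(100)}
cache = HotEmbeddingCache(store, capacity=2)
for fid in [7, 7, 3, 7, 9, 3]:
    cache.get(fid)
```

In this toy run, six lookups cause four slow-tier reads, because the repeated accesses to feature 7 are served from the cache. Real CTR workloads tend to have skewed feature frequencies, which is the general reason caching frequently used embeddings can pay off.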
Findings
Reduces data transfer latency between GPU and CPU.
Improves training efficiency for large embedding tables.
Demonstrates effectiveness through comprehensive experiments.
Abstract
Because of the superior feature representation ability of deep learning, industrial companies deploy various deep Click-Through Rate (CTR) models in commercial systems. To achieve better performance, deep CTR models must be trained efficiently on huge volumes of training data, which makes speeding up the training process an essential problem. Unlike models with dense training data, CTR models usually have high-dimensional, sparse training data. To transform the high-dimensional sparse input into low-dimensional dense real-valued vectors, almost all deep CTR models adopt an embedding layer, which can easily reach hundreds of GB or even TB. Since a single GPU cannot accommodate all the embedding parameters, it is not reasonable to rely on data parallelism alone when performing distributed training. Therefore, existing distributed…
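The embedding-size argument in the abstract can be made concrete with a short sketch. It shows a lookup from sparse feature IDs to dense vectors and a back-of-the-envelope memory estimate. The feature count, embedding dimension, and function names are assumptions chosen for illustration and are not taken from the paper.

```python
from typing import Dict, List


def embedding_table_bytes(num_features: int, dim: int, bytes_per_value: int = 4) -> int:
    """Memory needed to store one dense vector per feature ID."""
    return num_features * dim * bytes_per_value


def lookup(table: Dict[int, List[float]], feature_ids: List[int]) -> List[List[float]]:
    """Map sparse feature IDs to their dense vectors, one row per ID."""
    return [table[i] for i in feature_ids]


# Toy table: 5 feature IDs, each mapped to a 3-dimensional dense vector.
toy_table = {i: [float(i), float(i) * 0.5, -float(i)] for i in range(5)}
dense = lookup(toy_table, [4, 0, 2])

# Assumed scale: 10 billion feature IDs x 64 dimensions x 4 bytes (float32).
size_bytes = embedding_table_bytes(10_000_000_000, 64)
size_tb = size_bytes / 1e12  # decimal terabytes
```

Under these assumed numbers the table alone needs about 2.56 TB, far more than the memory of a single GPU, so replicating the full model on every device is not an option.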
Taxonomy
Topics: Recommender Systems and Techniques · Advanced Computing and Algorithms · Caching and Content Delivery
