ScaleFreeCTR: MixCache-based Distributed Training System for CTR Models with Huge Embedding Table

Huifeng Guo; Wei Guo; Yong Gao; Ruiming Tang; Xiuqiang He; Wenzhi Liu

arXiv:2104.08542·cs.IR·May 12, 2021

TL;DR

ScaleFreeCTR is a distributed training system for large-scale CTR models. It reduces GPU-CPU data transfer latency and improves training efficiency by synchronizing embeddings on GPUs and by using a caching mechanism called MixCache.

Contribution

The paper proposes ScaleFreeCTR, a novel MixCache-based system that shifts embedding synchronization to GPUs, reducing latency and overcoming CPU bottlenecks in training huge CTR models.

Findings

1. Reduces data transfer latency between GPU and CPU.
2. Improves training efficiency for large embedding tables.
3. Demonstrates effectiveness through comprehensive experiments.

Abstract

Because of the superior feature representation ability of deep learning, various deep Click-Through Rate (CTR) models are deployed in commercial systems by industrial companies. To achieve better performance, it is necessary to train deep CTR models efficiently on a huge volume of training data, which makes speeding up the training process an essential problem. Unlike models with dense training data, the training data for CTR models is usually high-dimensional and sparse. To transform the high-dimensional sparse input into low-dimensional dense real-valued vectors, almost all deep CTR models adopt an embedding layer, which easily reaches hundreds of GB or even TB. Since a single GPU cannot accommodate all the embedding parameters, data parallelism alone is not sufficient for distributed training. Therefore, existing distributed…
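To make the abstract's point about embedding size concrete, here is a minimal sketch of how an embedding table maps sparse feature IDs to dense vectors and how quickly its memory footprint grows. The function names (`embedding_table_bytes`, `lookup`), the feature count, and the embedding dimension are assumptions chosen for illustration; none of them come from the paper.

```python
import random
from typing import List


def embedding_table_bytes(num_features: int, dim: int, bytes_per_value: int = 4) -> int:
    """Memory needed to store one dense vector per sparse feature ID."""
    return num_features * dim * bytes_per_value


def lookup(table: List[List[float]], feature_ids: List[int]) -> List[List[float]]:
    """Map sparse feature IDs to their dense embedding rows."""
    return [table[i] for i in feature_ids]


# Hypothetical sizes (not from the paper): 10 billion feature IDs,
# each with a 16-dimensional float32 embedding.
size_bytes = embedding_table_bytes(10_000_000_000, 16)
size_gb = size_bytes / 1e9  # hundreds of GB, beyond a single GPU's memory

# Toy table: 8 feature IDs, 4-dimensional embeddings.
random.seed(0)
toy_table = [[random.random() for _ in range(4)] for _ in range(8)]

# A sparse sample is just a short list of active feature IDs.
sample_ids = [3, 5, 3]
dense = lookup(toy_table, sample_ids)
```

Under these assumed sizes the table alone needs 640 GB, which is why the embedding parameters have to be spread across devices or kept in host memory rather than replicated on every GPU.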

Code & Models

Repositories

mindspore-ai/models/tree/master/official/recommend/wide_and_deep
mindspore

Taxonomy

Topics: Recommender Systems and Techniques · Advanced Computing and Algorithms · Caching and Content Delivery