FeatureBox: Feature Engineering on GPUs for Massive-Scale Ads Systems
Weijie Zhao, Xuewu Jiao, Xinsheng Luo, Jingxue Li, Belhal Karimi, Ping Li

TL;DR
FeatureBox is an end-to-end GPU-based framework that accelerates feature extraction and training for large-scale online ad CTR models, significantly reducing training time by optimizing computation and memory management.
Contribution
It introduces a novel GPU-accelerated feature extraction pipeline with a layer-wise scheduling algorithm and dynamic memory management for industrial-scale CTR model training.
Findings
Achieves faster training times compared to traditional CPU-based methods.
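Reduces intermediate I/O by pipelining feature extraction with training on the same GPU servers, and reduces memory overhead through light-weight GPU memory management.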
Demonstrates significant performance improvements on real-world ad applications.
Abstract
Deep learning has been widely deployed for online ads systems to predict Click-Through Rate (CTR). Machine learning researchers and practitioners frequently retrain CTR models to test their new extracted features. However, the CTR model training often relies on a large number of raw input data logs. Hence, the feature extraction can take a significant proportion of the training time for an industrial-level CTR model. In this paper, we propose FeatureBox, a novel end-to-end training framework that pipelines the feature extraction and the training on GPU servers to save the intermediate I/O of the feature extraction. We rewrite computation-intensive feature extraction operators as GPU operators and leave the memory-intensive operator on CPUs. We introduce a layer-wise operator scheduling algorithm to schedule these heterogeneous operators. We present a light-weight GPU memory management…
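The abstract names a layer-wise operator scheduling algorithm but this excerpt does not spell it out. As a rough illustration only, the sketch below assumes the feature extraction operators form a dependency graph and groups them into layers with a standard topological (Kahn-style) traversal, so that every operator in a layer depends only on earlier layers and the CPU and GPU operators within a layer can be dispatched together. The function name, the operator names, and the `(device, dependencies)` encoding are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def layerwise_schedule(ops):
    """Group operators into dependency layers (generic topological layering).

    ops maps an operator name to (device, [names of operators it depends on]).
    Returns a list of layers; each layer maps a device to its operator names.
    This is an assumed, simplified stand-in, not FeatureBox's actual scheduler.
    """
    for name, (_, deps) in ops.items():
        for dep in deps:
            if dep not in ops:
                raise KeyError(f"{name} depends on unknown operator {dep}")

    indegree = {name: len(deps) for name, (_, deps) in ops.items()}
    dependents = defaultdict(list)
    for name, (_, deps) in ops.items():
        for dep in deps:
            dependents[dep].append(name)

    # Operators with no unmet dependencies form the first layer.
    frontier = sorted(name for name, deg in indegree.items() if deg == 0)
    layers, scheduled = [], 0
    while frontier:
        layer = defaultdict(list)
        for name in frontier:
            layer[ops[name][0]].append(name)  # bucket by device
        layers.append(dict(layer))
        scheduled += len(frontier)

        # Release operators whose dependencies are now all scheduled.
        next_frontier = []
        for name in frontier:
            for child in dependents[name]:
                indegree[child] -= 1
                if indegree[child] == 0:
                    next_frontier.append(child)
        frontier = sorted(next_frontier)

    if scheduled != len(ops):
        raise ValueError("operator graph contains a cycle")
    return layers


# Hypothetical operators: memory-intensive ones stay on CPU,
# computation-intensive ones run on GPU.
example_ops = {
    "parse_log": ("cpu", []),
    "join_user_dict": ("cpu", ["parse_log"]),
    "hash_query": ("gpu", ["parse_log"]),
    "cross_feature": ("gpu", ["hash_query", "join_user_dict"]),
}
example_layers = layerwise_schedule(example_ops)
```

Here `example_layers` has three layers: the log parser alone, then one CPU and one GPU operator that can run concurrently, then the GPU operator that consumes both. A production scheduler would also have to account for host-device transfer cost and GPU memory limits, which this sketch ignores.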
Taxonomy
Topics: Recommender Systems and Techniques · Advanced Bandit Algorithms Research · Image and Video Quality Assessment
Methods: Test
