Scalable Machine Learning Training Infrastructure for Online Ads Recommendation and Auction Scoring Modeling at Google

George Kurian; Somayeh Sardashti; Ryan Sims; Felix Berger; Gary Holt; Yang Li; Jeremiah Willcock; Kaiyuan Wang; Herve Quiroz; Abdulrahman Salem; Julian Grady

arXiv:2501.10546 · cs.DC · January 22, 2025

Open Access

TL;DR

This paper presents scalable infrastructure solutions for large-scale online ads machine learning at Google, addressing input processing, embedding optimization, and error handling, resulting in significant performance and cost improvements.

Contribution

It introduces novel techniques for input generation, embedding optimization, and resource management that enhance efficiency and scalability in production ML systems.

Findings

01. 116% performance boost in production models

02. 18% reduction in training costs

03. Effective handling of large-scale data and errors

Abstract

Large-scale Ads recommendation and auction scoring models at Google scale demand immense computational resources. While specialized hardware like TPUs has improved linear algebra computations, bottlenecks persist in large-scale systems. This paper proposes solutions for three critical challenges that must be addressed for efficient end-to-end execution in a widely used production infrastructure: (1) Input Generation and Ingestion Pipeline: Efficiently transforming raw features (e.g., "search query") into numerical inputs and streaming them to TPUs; (2) Large Embedding Tables: Optimizing conversion of sparse features into dense floating-point vectors for neural network consumption; (3) Interruptions and Error Handling: Minimizing resource wastage in large-scale shared datacenters. To tackle these challenges, we propose a shared input generation technique to reduce computational load of…

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Imbalanced Data Classification Techniques · Big Data and Business Intelligence