Enhancing Taobao Display Advertising with Multimodal Representations: Challenges, Approaches and Insights

Xiang-Rong Sheng; Feifan Yang; Litong Gong; Biao Wang; Zhangming Chan; Yujing Zhang; Yueyao Cheng; Yong-Nan Zhu; Tiezheng Ge; Han Zhu; Yuning Jiang; Jian Xu; Bo Zheng

arXiv:2407.19467·cs.IR·July 30, 2024

TL;DR

This paper presents a two-phase framework for integrating multimodal data into Taobao's display advertising system, addressing challenges of effectiveness and cost, leading to significant performance improvements.

Contribution

It introduces a novel two-phase approach combining multimodal pre-training and integration with ID-based models for industrial recommendation systems.

Findings

1. Significant performance improvements in Taobao advertising system
2. Effective and cost-efficient multimodal data integration method
3. Insights for practitioners on leveraging multimodal data

Abstract

Despite the recognized potential of multimodal data to improve model accuracy, many large-scale industrial recommendation systems, including the Taobao display advertising system, predominantly depend on sparse ID features in their models. In this work, we explore approaches to leverage multimodal data to enhance recommendation accuracy. We start by identifying the key challenges in adopting multimodal data in a manner that is both effective and cost-efficient for industrial systems. To address these challenges, we introduce a two-phase framework, including: 1) the pre-training of multimodal representations to capture semantic similarity, and 2) the integration of these representations with existing ID-based models. Furthermore, we detail the architecture of our production system, which is designed to facilitate the deployment of multimodal representations. Since the integration of…
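The abstract names the two phases but does not say how the pre-trained representations are combined with the ID-based model. The sketch below is one possible reading, not the authors' method: it assumes each item already has a frozen multimodal embedding from phase 1, summarizes how similar a candidate item is to the user's past items, and appends that summary to the candidate's ID embedding. All function names, the choice of max/mean similarity features, and the toy vectors are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_features(target_emb, history_embs):
    """Summarize target-vs-history similarity as [max, mean].

    Assumption: the embeddings come from a phase-1 model trained so that
    semantically similar items lie close together.
    """
    sims = [cosine(target_emb, h) for h in history_embs]
    if not sims:
        return [0.0, 0.0]
    return [max(sims), sum(sims) / len(sims)]


def build_model_input(id_emb, target_emb, history_embs):
    """Concatenate the ID embedding with the multimodal similarity features.

    Assumption: the downstream ID-based model accepts extra dense features.
    """
    return list(id_emb) + similarity_features(target_emb, history_embs)


# Toy example: one past item identical to the target, one orthogonal to it.
features = build_model_input(
    id_emb=[0.2, -0.1],
    target_emb=[1.0, 0.0],
    history_embs=[[1.0, 0.0], [0.0, 1.0]],
)
print(features)
```

Passing compact similarity statistics instead of the raw embeddings keeps the added input small, which is one way to address the cost concern the abstract raises; whether the production system does this is not stated here.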

Datasets

TaoBao-MM/Taobao-MM (dataset · 2.4k downloads)

Taxonomy

Topics: Digital Communication and Language · Subtitles and Audiovisual Media · Multimedia Communication and Technology