MAP: A Model-agnostic Pretraining Framework for Click-through Rate Prediction

Jianghao Lin; Yanru Qu; Wei Guo; Xinyi Dai; Ruiming Tang; Yong Yu; Weinan Zhang

arXiv:2308.01737·cs.IR·August 4, 2023

Open Access · 1 Repo

TL;DR

This paper introduces a model-agnostic pretraining framework for CTR prediction that leverages self-supervised learning through feature corruption and recovery, significantly improving performance on large-scale datasets.

Contribution

It proposes a novel self-supervised pretraining framework with two algorithms, MFP and RFD, tailored for multi-field categorical data in CTR prediction tasks.
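The corruption half of a corrupt-and-recover pretraining step can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a sample is a list of integer feature ids (one per field), and the names `MASK_ID`, `corrupt_fields`, and `mask_ratio` are hypothetical. A model would then be trained to recover the original ids in `targets` from the corrupted input.

```python
import random

# Hypothetical reserved id standing in for a masked feature (an assumption,
# not taken from the paper or its repository).
MASK_ID = 0


def corrupt_fields(sample, mask_ratio=0.1, rng=None):
    """Mask a random subset of fields in one multi-field categorical sample.

    Returns (corrupted, targets), where targets maps each masked field
    index to the original feature id that a model would try to recover.
    """
    rng = rng or random.Random()
    num_fields = len(sample)
    # Always mask at least one field so every sample yields a training signal.
    num_masked = max(1, round(mask_ratio * num_fields))
    masked_idx = rng.sample(range(num_fields), num_masked)

    corrupted = list(sample)
    targets = {}
    for i in masked_idx:
        targets[i] = sample[i]
        corrupted[i] = MASK_ID
    return corrupted, targets


# One sample with five fields, each holding a (nonzero) feature id.
sample = [17, 4, 93, 8, 256]
corrupted, targets = corrupt_fields(sample, mask_ratio=0.4, rng=random.Random(0))
```

Because the targets come from the input itself rather than from click labels, a step like this can be applied to any backbone that consumes field-wise feature ids, which is the sense in which such a scheme is model-agnostic.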

Findings

1. Achieves state-of-the-art results on Avazu and Criteo datasets.

2. Enhances model effectiveness and efficiency with the proposed pretraining methods.

3. Demonstrates compatibility with various backbone models like DCNv2 and DeepFM.

Abstract

With the widespread application of personalized online services, click-through rate (CTR) prediction has received more and more attention and research. The most prominent features of CTR prediction are its multi-field categorical data format, and vast and daily-growing data volume. The large capacity of neural models helps digest such massive amounts of data under the supervised learning paradigm, yet they fail to utilize the substantial data to its full potential, since the 1-bit click signal is not sufficient to guide the model to learn capable representations of features and instances. The self-supervised learning paradigm provides a more promising pretrain-finetune solution to better exploit the large amount of user click logs, and learn more generalized and effective representations. However, self-supervised learning for CTR prediction is still an open question, since current works…

Code & Models

Repositories

chiangel/map-code (PyTorch, official)

Taxonomy

Topics: Advanced Computing and Algorithms · Image and Video Quality Assessment · Recommender Systems and Techniques
