Toward a benchmark for CTR prediction in online advertising: datasets, evaluation protocols and perspectives
Shan Gao, Yanwu Yang

TL;DR
This paper introduces a comprehensive benchmark platform for CTR prediction in online advertising, including datasets, evaluation protocols, and a comparative study of various models, highlighting key performance insights and data efficiency of LLM-based approaches.
Contribution
The paper develops a unified CTR prediction benchmark platform with standardized evaluation protocols and conducts extensive comparative experiments on diverse models and datasets.
Findings
High-order models largely outperform low-order models, though the advantage varies across metrics.
LLM-based models achieve comparable performance using only 2% of the training data.
CTR prediction performance improved significantly from 2015 to 2016, then plateaued.
Abstract
This research designs a unified architecture for a CTR prediction benchmark (Bench-CTR) platform that offers flexible interfaces to datasets and to the components of a wide range of CTR prediction models. Moreover, we construct a comprehensive system of evaluation protocols encompassing real-world and synthetic datasets, a taxonomy of metrics, standardized procedures and experimental guidelines for calibrating the performance of CTR prediction models. Furthermore, we implement the proposed benchmark platform and conduct a comparative study to evaluate a wide range of state-of-the-art models, from traditional multivariate statistical models to modern large language model (LLM)-based approaches, on three public datasets and two synthetic datasets. Experimental results reveal that (1) high-order models largely outperform low-order models, though such advantage varies in terms of metrics and on different…
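To make the idea of scoring CTR models against a fixed set of metrics concrete, here is a minimal sketch in plain Python. It assumes AUC and log loss as the metrics, since both are widely used for CTR prediction; the excerpt above does not say which metrics Bench-CTR's taxonomy contains. The function names and the toy data are hypothetical and are not taken from the platform.

```python
import math


def log_loss(labels, probs, eps=1e-15):
    """Mean binary cross-entropy between click labels (0/1) and predicted CTRs."""
    total = 0.0
    for y, p in zip(labels, probs):
        # Clip predictions so that log(0) can never occur.
        p = min(max(p, eps), 1.0 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1.0 - p)
    return total / len(labels)


def auc(labels, scores):
    """Probability that a random clicked impression is scored above a random
    non-clicked one (ties count as half). O(P * N), fine for a small demo."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up data): four impressions, two of them clicked.
labels = [1, 0, 1, 0]
preds = [0.9, 0.1, 0.4, 0.6]
print("AUC:", auc(labels, preds))
print("LogLoss:", log_loss(labels, preds))
```

AUC measures only how well clicked impressions are ranked above non-clicked ones, while log loss also penalizes poorly calibrated probabilities, so two models can order differently under the two metrics.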
Taxonomy
Topics: Consumer Market Behavior and Pricing · Digital Marketing and Social Media · Recommender Systems and Techniques
