TL;DR
This paper introduces a hybrid CNN attention module that integrates category prior information to enhance user image behavior modeling for CTR prediction, outperforming traditional two-stage methods.
Contribution
The paper proposes a novel hybrid CNN attention approach that unifies image behaviors with category prior, addressing limitations of existing two-stage models for CTR prediction.
Findings
Significant online and offline performance improvements.
Effective integration of category prior enhances visual feature extraction.
Outperforms traditional two-stage CNN-based models.
Abstract
User historical behaviors have proved useful for Click-Through Rate (CTR) prediction in online advertising systems. At Meituan, one of the largest e-commerce platforms in China, an item is typically displayed with its image, and whether a user clicks the item is usually influenced by that image. This implies that a user's image behaviors are helpful for understanding the user's visual preferences and improving the accuracy of CTR prediction. Existing user image behavior models typically use a two-stage architecture, which extracts visual embeddings of images through off-the-shelf Convolutional Neural Networks (CNNs) in the first stage, and then jointly trains a CTR model with those visual embeddings and non-visual features. We find that the two-stage architecture is sub-optimal for CTR prediction. Meanwhile, precisely labeled categories in online ad systems contain abundant visual prior…
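To make the two-stage baseline described in the abstract concrete, here is a minimal sketch in NumPy. It is not the paper's code: the frozen random projection `W_frozen` stands in for an off-the-shelf CNN, the CTR model is plain logistic regression, and all names, dimensions, and the synthetic data are assumptions chosen for illustration. The point it shows is that stage 2 only updates the CTR model's weights, so the visual embeddings are never adapted to the click objective.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for this illustration.
IMG_DIM, EMB_DIM, NONVIS_DIM = 64, 8, 4

# Stage 1 stand-in: a frozen feature extractor. A real system would use a
# pretrained CNN; a fixed random projection plays that role here (assumption).
W_frozen = rng.normal(size=(IMG_DIM, EMB_DIM)) / np.sqrt(IMG_DIM)


def extract_embeddings(images):
    """Map flattened images (..., IMG_DIM) to embeddings (..., EMB_DIM)."""
    return np.tanh(images @ W_frozen)


def build_features(behavior_images, candidate_image, non_visual):
    """Mean-pool the user's image-behavior embeddings, then concatenate them
    with the candidate item's embedding and the non-visual features."""
    user_vec = extract_embeddings(behavior_images).mean(axis=1)  # (B, EMB_DIM)
    cand_vec = extract_embeddings(candidate_image)               # (B, EMB_DIM)
    return np.concatenate([user_vec, cand_vec, non_visual], axis=1)


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def log_loss(x, y, w, b):
    p = np.clip(sigmoid(x @ w + b), 1e-7, 1 - 1e-7)
    return float(-(y * np.log(p) + (1 - y) * np.log(1 - p)).mean())


def train_ctr_model(x, y, lr=0.1, steps=200):
    """Stage 2: logistic regression on the precomputed features.
    Only w and b are updated; W_frozen never receives a gradient."""
    w = np.zeros(x.shape[1])
    b = 0.0
    for _ in range(steps):
        grad = sigmoid(x @ w + b) - y
        w -= lr * (x.T @ grad) / len(y)
        b -= lr * grad.mean()
    return w, b


# Synthetic batch: B users, each with T past image behaviors.
B, T = 256, 5
behavior_images = rng.normal(size=(B, T, IMG_DIM))
candidate_image = rng.normal(size=(B, IMG_DIM))
non_visual = rng.normal(size=(B, NONVIS_DIM))
clicks = (non_visual[:, 0] > 0).astype(float)  # toy labels, not real click data

features = build_features(behavior_images, candidate_image, non_visual)
loss_before = log_loss(features, clicks, np.zeros(features.shape[1]), 0.0)
w, b = train_ctr_model(features, clicks)
loss_after = log_loss(features, clicks, w, b)
```

In this sketch the embeddings are computed once and treated as fixed inputs. An end-to-end alternative, which the summary attributes to the paper's hybrid CNN attention module, would instead let the click loss shape the visual feature extraction itself.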
