MIM: Multi-modal Content Interest Modeling Paradigm for User Behavior Modeling
Bencheng Yan, Si Chen, Shichang Jia, Jianyu Liu, Yueran Liu, Chenghan Fu, Wanxian Guan, Hui Zhao, Xiang Zhang, Kai Zhang, Wenbo Su, Pengjie Wang, Jian Xu, Bo Zheng, Baolin Liu

TL;DR
This paper introduces MIM, a multi-modal content interest modeling framework that improves user interest prediction by integrating content modalities and user behavior signals, significantly enhancing CTR in large-scale e-commerce.
Contribution
The paper presents a multi-modal interest modeling paradigm built from three stages (Pre-training, Content-Interest-Aware Supervised Fine-Tuning, and Content-Interest-Aware UBM) that addresses the limitations of ID embeddings in cold-start and long-tail scenarios.
Findings
Achieved +14.14% CTR increase in online deployment.
Demonstrated superior performance over traditional ID embedding methods.
Successfully applied in Taobao's large-scale e-commerce platform.
Abstract
Click-Through Rate (CTR) prediction is a crucial task in recommendation systems, online searches, and advertising platforms, where accurately capturing users' real interests in content is essential for performance. However, existing methods heavily rely on ID embeddings, which fail to reflect users' true preferences for content such as images and titles. This limitation becomes particularly evident in cold-start and long-tail scenarios, where traditional approaches struggle to deliver effective results. To address these challenges, we propose a novel Multi-modal Content Interest Modeling paradigm (MIM), which consists of three key stages: Pre-training, Content-Interest-Aware Supervised Fine-Tuning (C-SFT), and Content-Interest-Aware UBM (CiUBM). The pre-training stage adapts foundational models to domain-specific data, enabling the extraction of high-quality multi-modal embeddings. The…
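The cold-start argument in the abstract can be illustrated with a toy example. The sketch below is not the paper's CiUBM module; it is a generic target-attention pooling over content embeddings, written in plain Python, and the function names (`content_interest`, `softmax`, `dot`) and the 2-dimensional vectors are all hypothetical. It shows why a brand-new item can still be scored: it has no ID history, but it does have a content embedding that can be compared against the content of items the user has already interacted with.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def content_interest(candidate_emb, history_embs):
    """Attention-pool the content embeddings of a user's history.

    Each history item is weighted by its similarity to the candidate's
    content embedding, so no item ID is needed at any point.
    Returns a zero vector when the history is empty.
    """
    dim = len(candidate_emb)
    if not history_embs:
        return [0.0] * dim
    weights = softmax([dot(candidate_emb, h) for h in history_embs])
    return [sum(w * h[i] for w, h in zip(weights, history_embs)) for i in range(dim)]


# Hypothetical content embeddings (e.g. derived from images and titles).
history = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
new_item = [1.0, 0.0]  # never seen before, so it has no trained ID embedding

interest = content_interest(new_item, history)
score = dot(interest, new_item)  # higher means closer to past content interest
```

In a real system the pooled interest vector would be one input to a CTR model rather than a final score, and the embeddings would come from a pre-trained and fine-tuned multi-modal encoder as the abstract describes.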
Taxonomy
Topics: Recommender Systems and Techniques · Digital Marketing and Social Media · Web Data Mining and Analysis
Methods: Mutual Information Machine/Mask Image Modeling
