Multimodal Pretraining and Generation for Recommendation: A Tutorial
Jieming Zhu, Chuhan Wu, Rui Zhang, Zhenhua Dong

TL;DR
This tutorial reviews recent advances in multimodal pretraining and generation techniques that extend recommender systems beyond traditional ID-based methods, and it discusses open challenges and future directions for multimedia applications.
Contribution
It provides a comprehensive overview of the latest multimodal pretraining and generation methods for recommendation systems, highlighting new techniques and industrial applications.
Findings
Multimodal pretraining improves recommendation accuracy across diverse media.
Generation techniques enable personalized content creation in recommendation systems.
Open challenges include data scarcity and model scalability.
Abstract
Personalized recommendation stands as a ubiquitous channel for users to explore information or items aligned with their interests. Nevertheless, prevailing recommendation models predominantly rely on unique IDs and categorical features for user-item matching. While this ID-centric approach has witnessed considerable success, it falls short in comprehensively grasping the essence of raw item contents across diverse modalities, such as text, image, audio, and video. This underutilization of multimodal data poses a limitation to recommender systems, particularly in the realm of multimedia services like news, music, and short-video platforms. The recent surge in pretraining and generation techniques presents both opportunities and challenges in the development of multimodal recommender systems. This tutorial seeks to provide a thorough exploration of the latest advancements and future…
Taxonomy
Topics: Topic Modeling · Speech and dialogue systems · Advanced Text Analysis Techniques
