Dreaming User Multimodal Representation Guided by The Platonic Representation Hypothesis for Micro-Video Recommendation

Chengzhi Lin; Hezheng Lin; Shuchang Liu; Cangguang Ruan; LingJing Xu; Dezhao Yang; Chuyuan Wang; Yongqi Liu

arXiv:2410.03538·cs.IR·October 22, 2024

TL;DR

This paper introduces DreamUMM, a multimodal user representation method inspired by the Platonic Representation Hypothesis, which improves micro-video recommendations by capturing dynamic user interests in a shared multimodal space, validated through large-scale online tests.

Contribution

We propose DreamUMM, a novel real-time multimodal user representation approach based on a closed-form solution, and Candidate-DreamUMM for cold-start scenarios, demonstrating practical scalability and effectiveness.

Findings

01. Significant improvements in user engagement metrics.

02. Successful deployment on platforms with hundreds of millions of users.

03. Empirical evidence supporting multimodal interest convergence.

Abstract

The proliferation of online micro-video platforms has underscored the necessity for advanced recommender systems to mitigate information overload and deliver tailored content. Despite advancements, accurately and promptly capturing dynamic user interests remains a formidable challenge. Inspired by the Platonic Representation Hypothesis, which posits that different data modalities converge towards a shared statistical model of reality, we introduce DreamUMM (Dreaming User Multi-Modal Representation), a novel approach leveraging user historical behaviors to create real-time user representation in a multimodal space. DreamUMM employs a closed-form solution correlating user video preferences with multimodal similarity, hypothesizing that user interests can be effectively represented in a unified multimodal space. Additionally, we propose Candidate-DreamUMM for scenarios lacking recent user…
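The abstract mentions a closed-form solution linking user video preferences to multimodal similarity but does not state the objective on this page. As a minimal sketch, assume the user vector u is a unit vector chosen to maximize the preference-weighted sum of inner products with the embeddings of previously watched videos. Under that assumption, the maximizer is the normalized weighted sum of the embeddings (by the Cauchy-Schwarz inequality). The function name, the weights, and the toy embeddings below are hypothetical and are not taken from the paper.

```python
import math


def dream_user_vector(video_embeddings, preference_weights):
    """Return the unit vector u that maximizes sum_i w_i * <u, v_i>.

    Assumed objective (not confirmed by the abstract): with ||u|| = 1,
    the maximizer is s / ||s||, where s = sum_i w_i * v_i.
    """
    if not video_embeddings:
        raise ValueError("need at least one video embedding")
    if len(video_embeddings) != len(preference_weights):
        raise ValueError("one preference weight is required per video")

    dim = len(video_embeddings[0])
    weighted_sum = [0.0] * dim
    for embedding, weight in zip(video_embeddings, preference_weights):
        if len(embedding) != dim:
            raise ValueError("all embeddings must share one dimension")
        for k in range(dim):
            weighted_sum[k] += weight * embedding[k]

    norm = math.sqrt(sum(x * x for x in weighted_sum))
    if norm == 0.0:
        raise ValueError("weighted sum is zero; no direction is preferred")
    return [x / norm for x in weighted_sum]


# Hypothetical history: two videos in a 2-d multimodal space, with
# preference weights standing in for signals such as watch-time ratio.
history = [[1.0, 0.0], [0.0, 1.0]]
weights = [3.0, 4.0]
user_vector = dream_user_vector(history, weights)
print(user_vector)
```

Because the result needs only a weighted sum and a normalization, a representation of this kind can be refreshed on every new interaction without gradient-based training, which is consistent with the real-time framing in the abstract.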

Taxonomy

Topics: Human Mobility and Location-Based Analysis