Multimodal Topic Learning for Video Recommendation
Shi Pu, Yijiang He, Zheng Li, Mao Zheng

TL;DR
This paper introduces a multimodal topic learning approach that explicitly generates semantic video topics offline from tags, titles, and images, improving recommendation accuracy and efficiency in video streaming platforms.
Contribution
It proposes a novel multimodal topic learning algorithm that separates topic generation from recommendation, reducing online computation and enhancing recommendation quality.
Findings
The algorithm effectively generates semantic video topics offline.
Using semantic topics improves recommendation accuracy.
Generating topics offline significantly reduces online computational cost.
Abstract
Facilitated by deep neural networks, video recommendation systems have made significant advances. Existing video recommendation systems feed features from different modalities (e.g., user personal data, user behavior data, video titles, video tags, and visual content) directly into deep neural networks, expecting the networks to mine user-preferred topics implicitly from these features online. However, the lack of semantic topic information in these features limits recommendation accuracy. In addition, feature crosses that use visual content features produce high-dimensional features that heavily degrade the networks' online computational efficiency. In this paper, we explicitly separate topic generation from recommendation generation and propose a multimodal topic learning algorithm that exploits three modalities (i.e., tags, titles, and cover images) for generating…
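The separation described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the nearest-centroid assignment below is a stand-in for the paper's multimodal topic learning, and the feature vectors, centroids, and all function names are hypothetical. It only shows the shape of the idea, in which topics are computed once offline from tags, titles, and cover images, and the online path does a cheap lookup instead of processing high-dimensional visual features.

```python
from typing import Dict, List

# Hypothetical per-modality feature vectors. In a real system these would come
# from text and image encoders, which this summary does not specify.
Video = Dict[str, List[float]]


def fuse(video: Video) -> List[float]:
    """Concatenate tag, title, and cover-image features into one vector."""
    return video["tags"] + video["title"] + video["cover"]


def nearest(vec: List[float], centroids: List[List[float]]) -> int:
    """Return the index of the centroid closest to vec (squared Euclidean)."""
    dists = [sum((a - b) ** 2 for a, b in zip(vec, c)) for c in centroids]
    return dists.index(min(dists))


def build_topic_table(videos: Dict[str, Video],
                      centroids: List[List[float]]) -> Dict[str, int]:
    """Offline stage: assign every video a topic id once and store it."""
    return {vid: nearest(fuse(v), centroids) for vid, v in videos.items()}


def online_features(video_id: str, topic_table: Dict[str, int]) -> Dict[str, object]:
    """Online stage: look up the precomputed topic; no visual features needed."""
    return {"video_id": video_id, "topic_id": topic_table[video_id]}


# Toy data, assumed for illustration only.
VIDEOS = {
    "v1": {"tags": [1.0, 0.0], "title": [0.9], "cover": [0.1]},
    "v2": {"tags": [0.0, 1.0], "title": [0.1], "cover": [0.8]},
}
CENTROIDS = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]  # assumed topic centers

TOPIC_TABLE = build_topic_table(VIDEOS, CENTROIDS)
```

In this sketch the recommender would consume `online_features("v2", TOPIC_TABLE)` as a single low-dimensional categorical feature, which is the kind of saving the abstract attributes to moving topic generation offline.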
Taxonomy
Topics: Multimodal Machine Learning Applications · Video Analysis and Summarization · Recommender Systems and Techniques
