Cross-Modal Content Inference and Feature Enrichment for Cold-Start Recommendation
Haokai Ma, Zhuang Qi, Xinxin Dong, Xiangxian Li, Yuze Zheng, Xiangxu Meng, and Lei Meng

TL;DR
This paper introduces CIERec, a novel recommendation framework that leverages cross-modal content inference and feature enrichment to significantly enhance cold-start recommendation performance, especially when multi-modal data is limited.
Contribution
CIERec is the first framework to utilize privileged image annotation and cross-modal inference for improving cold-start recommendations, demonstrating universality across different backbones.
Findings
CIERec outperforms existing visually-aware algorithms in cold-start scenarios.
It achieves consistent improvements across multiple datasets.
The framework is effective with various backbone models.
Abstract
Multimedia recommendation aims to fuse the multi-modal information of items for feature enrichment to improve recommendation performance. However, existing methods typically introduce multi-modal information on top of collaborative information to improve overall recommendation precision, while failing to explore cold-start recommendation performance. Moreover, these methods are applicable only when such multi-modal data is available. To address this problem, this paper proposes a recommendation framework, named Cross-modal Content Inference and Feature Enrichment Recommendation (CIERec), which exploits multi-modal information to improve cold-start recommendation performance. Specifically, CIERec first introduces image annotation as the privileged information to help guide the mapping of unified features from the visual space to the semantic space in the…
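The idea of mapping features from the visual space to the semantic space, with annotations available only at training time, can be illustrated with a small sketch. This is not CIERec's actual architecture, which the truncated abstract does not specify. It swaps in a plain ridge-regression linear map, and the function names (`fit_visual_to_semantic`, `enrich`), the feature dimensions, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def fit_visual_to_semantic(visual, semantic, ridge=1e-2):
    """Fit a linear map W minimising ||visual @ W - semantic||^2 + ridge * ||W||^2.

    Hypothetical stand-in for a learned cross-modal inference module.
    """
    dim = visual.shape[1]
    gram = visual.T @ visual + ridge * np.eye(dim)
    return np.linalg.solve(gram, visual.T @ semantic)


def enrich(visual, weights):
    """Concatenate visual features with the semantic features inferred from them."""
    inferred_semantic = visual @ weights
    return np.concatenate([visual, inferred_semantic], axis=1)


rng = np.random.default_rng(0)

# Training items: visual features plus annotation embeddings (privileged information).
train_visual = rng.normal(size=(200, 16))
true_map = rng.normal(size=(16, 4))        # synthetic ground truth, for the demo only
train_semantic = train_visual @ true_map   # stand-in for annotation embeddings

W = fit_visual_to_semantic(train_visual, train_semantic)

# Cold-start items: only visual features are available at inference time.
cold_visual = rng.normal(size=(5, 16))
cold_features = enrich(cold_visual, W)
print(cold_features.shape)
```

The enriched vectors could then be passed to any backbone recommender in place of the raw visual features; how CIERec itself fuses them is not stated in the text above.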
Taxonomy
Topics: Advanced Image and Video Retrieval Techniques · Multimodal Machine Learning Applications · Visual Attention and Saliency Detection
