Representation Learning for Image-based Music Recommendation
Chih-Chun Hsia, Kwei-Herng Lai, Yian Chen, Chuan-Ju Wang, Ming-Feng Tsai

TL;DR
This paper introduces a novel representation learning framework that uses images to improve music recommendation by bridging the gap between image and music data, demonstrated through image-to-song retrieval tasks.
Contribution
It presents a new method for image-based music recommendation that effectively connects visual and auditory data for contextual suggestions.
Findings
In preliminary experiments, retrieves relevant or conceptually similar songs for input images
Bridges the heterogeneity gap between image and music data
Serves as a key component for contextual music recommendation tasks
Abstract
Image perception is one of the most direct ways to provide contextual information about a user concerning his/her surrounding environment; hence images are a suitable proxy for contextual recommendation. We propose a novel representation learning framework for image-based music recommendation that bridges the heterogeneity gap between music and image data; the proposed method is a key component for various contextual recommendation tasks. Preliminary experiments show that for an image-to-song retrieval task, the proposed method retrieves relevant or conceptually similar songs for input images.
Taxonomy
Topics: Music and Audio Processing · Advanced Image and Video Retrieval Techniques · Image Retrieval and Classification Techniques
