Content-based feature exploration for transparent music recommendation using self-attentive genre classification
Seungjin Lee, Juheon Lee, Kyogu Lee

TL;DR
This paper introduces a method for enhancing interpretability in music recommendation systems by using self-attention to analyze lyric and acoustic features, enabling better understanding of content-based suggestions.
Contribution
It proposes a novel approach employing self-attention models to extract and visualize interpretable content features from lyrics and audio for music recommendation.
Findings
Self-attention models effectively highlight content features.
Visualizations help users understand recommendation rationale.
Similar song retrieval demonstrates interpretability of features.
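The similar-song retrieval mentioned above can be illustrated with a nearest-neighbour lookup over per-track feature vectors. This is a minimal sketch only: the summary does not state which similarity measure the authors use, so cosine similarity is an assumption here, and `most_similar`, the toy `features` matrix, and `top_k` are hypothetical names introduced for illustration.

```python
import numpy as np

def most_similar(features, query_index, top_k=3):
    """Return (track_index, cosine_similarity) pairs for the top_k nearest tracks.

    features: (n_tracks, dim) array of per-track feature vectors
              (assumed to come from a lyric or acoustic model).
    """
    # Normalise each row to unit length so a dot product equals cosine similarity.
    norms = np.linalg.norm(features, axis=1, keepdims=True)
    unit = features / np.clip(norms, 1e-12, None)

    sims = unit @ unit[query_index]
    sims[query_index] = -np.inf  # exclude the query track itself

    order = np.argsort(-sims)[:top_k]
    return [(int(i), float(sims[i])) for i in order]

# Toy feature matrix: track 1 points almost the same way as track 0,
# track 2 is orthogonal to it, and track 3 points the opposite way.
features = np.array([
    [1.0, 0.0],
    [0.9, 0.1],
    [0.0, 1.0],
    [-1.0, 0.0],
])
result = most_similar(features, query_index=0, top_k=2)
```

In this toy case the nearest neighbours of track 0 are track 1 and then track 2. With real features, the same lookup could be run separately on lyric and audio vectors to compare what each modality considers similar.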
Abstract
Interpretation of retrieved results is an important issue in music recommender systems, particularly from a user perspective. In this study, we investigate the methods for providing interpretability of content features using self-attention. We extract lyric features with the self-attentive genre classification model trained on 140,000 tracks of lyrics. Likewise, we extract acoustic features using the acoustic model with self-attention trained on 120,000 tracks of acoustic signals. The experimental results show that the proposed methods provide the characteristics that are interpretable in terms of both lyrical and musical contents. We demonstrate this by visualizing the attention weights, and by presenting the most similar songs found using lyric or audio features.
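To make the idea of interpretable attention weights concrete, the sketch below pools a sequence of per-token (or per-frame) features into a single track vector using learned attention weights. It assumes a common additive attention-pooling form, which the abstract does not confirm as the authors' exact architecture; `attention_pool`, `W1`, `w2`, and all dimensions are hypothetical, and the parameters are random rather than trained.

```python
import numpy as np

def attention_pool(H, W1, w2):
    """Attention-weighted pooling over a sequence.

    H:  (T, d) features for T tokens or frames
    W1: (k, d) projection matrix (assumed form, not from the paper)
    w2: (k,)   scoring vector    (assumed form, not from the paper)
    Returns the pooled (d,) vector and the (T,) attention weights.
    """
    scores = np.tanh(H @ W1.T) @ w2        # one scalar score per position
    scores = scores - scores.max()         # shift for numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over positions
    pooled = weights @ H                   # weighted sum of the features
    return pooled, weights

rng = np.random.default_rng(0)
T, d, k = 6, 8, 4
H = rng.normal(size=(T, d))
W1 = rng.normal(size=(k, d))
w2 = rng.normal(size=k)

pooled, weights = attention_pool(H, W1, w2)
```

The `weights` vector is the quantity one would visualize: it sums to 1 over positions, so larger entries mark the words or audio frames that contribute most to the pooled representation fed to a genre classifier.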
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing
