Are We There Yet? A Brief Survey of Music Emotion Prediction Datasets, Models and Outstanding Challenges
Jaeyong Kang, Dorien Herremans

TL;DR
This survey reviews music emotion prediction datasets, models, and challenges, emphasizing the need for standardized benchmarks, larger datasets, and better model interpretability to advance the field.
Contribution
It provides a comprehensive overview of existing datasets, models, and challenges in music emotion prediction, highlighting key issues and future directions.
Findings
Persistent challenges include dataset quality and annotation ambiguity.
Cross-dataset generalization remains difficult.
Standardized benchmarks and larger datasets are needed.
Abstract
Deep learning models for music have advanced drastically in recent years, but how good are machine learning models at capturing emotion, and what challenges are researchers facing? In this paper, we provide a comprehensive overview of the available music-emotion datasets and discuss evaluation standards as well as competitions in the field. We also offer a brief overview of various types of music emotion prediction models that have been built over the years, providing insights into the diverse approaches within the field. Through this examination, we highlight the challenges that persist in accurately capturing emotion in music, including issues related to dataset quality, annotation consistency, and model generalization. Additionally, we explore the impact of different modalities, such as audio, MIDI, and physiological signals, on the effectiveness of emotion prediction models. Through…
Taxonomy
Topics: Music and Audio Processing
