Music Emotion Prediction Using Recurrent Neural Networks
Xinyu Chang, Xiangyu Zhang, Haoruo Zhang, Yulu Ran

TL;DR
This paper investigates using recurrent neural networks to classify music emotions based on audio features, aiming to improve music recommendation and therapeutic applications by capturing emotional nuances.
Contribution
The paper introduces a recurrent neural network approach that predicts which of Russell's four emotion quadrants a music clip belongs to from extracted audio features, and shows that the approach remains effective on small datasets.
Findings
RNNs can effectively predict music emotions from audio features.
Simpler RNN architectures perform comparably or better than complex models on small datasets.
Models trained on augmented and external datasets show promising results.
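To make the "simpler RNN" finding concrete, the following is a minimal sketch of a vanilla (Elman) RNN forward pass over a sequence of per-frame feature vectors, ending in a softmax over four emotion classes. It is not the authors' implementation: the layer sizes, the random weights and input, and the helper names (`init_matrix`, `matvec`, `softmax`, `rnn_classify`) are illustrative assumptions, and no training is performed.

```python
import math
import random


def init_matrix(rows, cols, rng, scale=0.1):
    """Return a rows x cols matrix of small random weights (hypothetical init)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def rnn_classify(frames, w_xh, w_hh, b_h, w_hy, b_y):
    """Run a vanilla RNN over the frames and classify from the last hidden state."""
    hidden = [0.0] * len(b_h)
    for frame in frames:
        from_input = matvec(w_xh, frame)
        from_state = matvec(w_hh, hidden)
        # h_t = tanh(W_xh x_t + W_hh h_{t-1} + b_h)
        hidden = [math.tanh(a + r + b) for a, r, b in zip(from_input, from_state, b_h)]
    logits = [v + b for v, b in zip(matvec(w_hy, hidden), b_y)]
    return softmax(logits)


# Assumed sizes: 20 features per frame, 8 hidden units, 4 emotion quadrants.
rng = random.Random(0)
n_features, n_hidden, n_classes = 20, 8, 4
w_xh = init_matrix(n_hidden, n_features, rng)
w_hh = init_matrix(n_hidden, n_hidden, rng)
w_hy = init_matrix(n_classes, n_hidden, rng)
b_h = [0.0] * n_hidden
b_y = [0.0] * n_classes

# Stand-in for a clip: 50 frames of random features (not real audio data).
frames = [[rng.gauss(0.0, 1.0) for _ in range(n_features)] for _ in range(50)]
probs = rnn_classify(frames, w_xh, w_hh, b_h, w_hy, b_y)
print(probs)
```

A model of this kind has only three weight matrices, which is one plausible reason a simple architecture can hold its own when only a few hundred labeled clips are available.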
Abstract
This study explores the application of recurrent neural networks to recognize emotions conveyed in music, aiming to enhance music recommendation systems and support therapeutic interventions by tailoring music to fit listeners' emotional states. We utilize Russell's Emotion Quadrant to categorize music into four distinct emotional regions and develop models capable of accurately predicting these categories. Our approach involves extracting a comprehensive set of audio features using Librosa and applying various recurrent neural network architectures, including standard RNNs, Bidirectional RNNs, and Long Short-Term Memory (LSTM) networks. Initial experiments are conducted using a dataset of 900 audio clips, labeled according to the emotional quadrants. We compare the performance of our neural network models against a set of baseline classifiers and analyze their effectiveness in…
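The four-region labeling described in the abstract can be illustrated with a short sketch that maps a (valence, arousal) pair to a quadrant of Russell's model. The quadrant numbering (Q1 = positive valence and high arousal, proceeding counterclockwise), the origin-centered scale, the tie-breaking at zero, and the function name `russell_quadrant` are assumptions for illustration; the abstract does not specify them.

```python
def russell_quadrant(valence: float, arousal: float) -> str:
    """Map a (valence, arousal) pair, centered at 0, to a quadrant label.

    Assumed convention (not stated in the source):
      Q1: positive valence, high arousal (e.g. happy, excited)
      Q2: negative valence, high arousal (e.g. angry, tense)
      Q3: negative valence, low arousal  (e.g. sad, depressed)
      Q4: positive valence, low arousal  (e.g. calm, relaxed)
    Values exactly at 0 are treated as non-negative.
    """
    if valence >= 0 and arousal >= 0:
        return "Q1"
    if valence < 0 and arousal >= 0:
        return "Q2"
    if valence < 0 and arousal < 0:
        return "Q3"
    return "Q4"


# Hypothetical annotations for four clips.
examples = [(0.7, 0.6), (-0.5, 0.8), (-0.6, -0.4), (0.4, -0.7)]
labels = [russell_quadrant(v, a) for v, a in examples]
print(labels)
```

Labels produced this way serve as the four target classes for the classifiers.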
Taxonomy
Topics: Music and Audio Processing
Methods: Sparse Evolutionary Training
