Music Mood Detection Based On Audio And Lyrics With Deep Neural Net

Rémi Delbouys; Romain Hennequin; Francesco Piccoli; Jimena Royo-Letelier; Manuel Moussallam

arXiv:1809.07276 · cs.IR · September 21, 2018 · 31 cites

TL;DR

This paper presents a deep learning approach for multimodal music mood prediction using audio and lyrics, outperforming traditional methods on arousal detection and improving valence prediction through optimized modality fusion.

Contribution

The paper proposes a deep learning model for multimodal music mood prediction and compares it against reproduced classical feature-engineering approaches: it outperforms them on arousal detection and matches them on valence prediction.

Findings

01. Deep learning outperforms traditional models on arousal detection.

02. Both approaches perform equally on valence prediction.

03. Optimizing the fusion of modalities jointly with each unimodal model, rather than fusing a posteriori, significantly improves valence prediction.

Abstract

We consider the task of multimodal music mood prediction based on the audio signal and the lyrics of a track. We reproduce the implementation of traditional feature engineering based approaches and propose a new model based on deep learning. We compare the performance of both approaches on a database containing 18,000 tracks with associated valence and arousal values and show that our approach outperforms classical models on the arousal detection task, and that both approaches perform equally on the valence prediction task. We also compare the a posteriori fusion with fusion of modalities optimized simultaneously with each unimodal model, and observe a significant improvement of valence prediction. We release part of our database for comparison purposes.
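
The a posteriori fusion mentioned in the abstract can be illustrated with a minimal sketch. It assumes each unimodal model already outputs one valence prediction per track, and it combines the two with a single weight fitted by least squares on held-out data. The weighted average, the function names, the use of mean squared error, and the toy numbers are all illustrative assumptions, not the paper's fusion rule, metric, or data; the jointly optimized fusion is not shown here.

```python
# Minimal sketch of a posteriori (late) fusion of two unimodal regressors.
# Assumption: fusion is a weighted average w * audio + (1 - w) * lyrics.
# All names and numbers below are hypothetical.


def mse(pred, target):
    """Mean squared error between two equal-length sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def late_fusion_weight(audio_pred, lyrics_pred, target):
    """Weight w in [0, 1] minimizing the MSE of w*audio + (1-w)*lyrics.

    With d = audio - lyrics and r = target - lyrics, the error is
    sum((w*d - r)^2), whose unconstrained minimizer is sum(d*r) / sum(d*d).
    """
    num = sum((a - l) * (t - l)
              for a, l, t in zip(audio_pred, lyrics_pred, target))
    den = sum((a - l) ** 2 for a, l in zip(audio_pred, lyrics_pred))
    if den == 0:
        # Both models agree everywhere, so any weight gives the same output.
        return 0.5
    # The objective is convex in w, so clipping yields the optimum on [0, 1].
    return min(1.0, max(0.0, num / den))


def fuse(audio_pred, lyrics_pred, w):
    """Apply the fitted weight to the unimodal predictions."""
    return [w * a + (1 - w) * l for a, l in zip(audio_pred, lyrics_pred)]


# Hypothetical validation-set valence values and unimodal predictions.
target = [0.5, -0.2, 0.1, -0.6]
audio_pred = [0.3, -0.1, 0.4, -0.2]
lyrics_pred = [0.6, -0.4, 0.0, -0.7]

w = late_fusion_weight(audio_pred, lyrics_pred, target)
fused = fuse(audio_pred, lyrics_pred, w)

print("fusion weight on audio:", round(w, 3))
print("audio MSE :", round(mse(audio_pred, target), 4))
print("lyrics MSE:", round(mse(lyrics_pred, target), 4))
print("fused MSE :", round(mse(fused, target), 4))
```

Because the weights 0 and 1 are both allowed, the fused error on the data used to fit the weight can never exceed that of the better unimodal model. In this setup the unimodal models are frozen before fusion, which is the contrast with the jointly optimized fusion the abstract reports as better for valence.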

Code & Models

Repositories

Dohppak/Music-Emotion-Recognition-Classification
pytorch

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies