Two-level Explanations in Music Emotion Recognition

Verena Haunschmid; Shreyan Chowdhury; Gerhard Widmer

arXiv:1905.11760·cs.SD·May 29, 2019·5 cites

TL;DR

This paper introduces a two-step explanation method for music emotion recognition models, linking audio features to perceptual features and then to emotion predictions, enhancing interpretability.

Contribution

It proposes a two-level explanation approach that links spectrogram patterns to mid-level perceptual features, and those features to emotion predictions, improving the interpretability of ML models for music emotion recognition.

Findings

01. Enables focus on specific musical reasons for predictions

02. Allows visual and acoustic interpretation of influential audio patterns

03. Improves understanding of model decision processes

Abstract

Current ML models for music emotion recognition, while generally working quite well, do not give meaningful or intuitive explanations for their predictions. In this work, we propose a 2-step procedure to arrive at spectrogram-level explanations that connect certain aspects of the audio to interpretable mid-level perceptual features, and these to the actual emotion prediction. That makes it possible to focus on specific musical reasons for a prediction (in terms of perceptual features), and to trace these back to patterns in the audio that can be interpreted visually and acoustically.
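
The two steps described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not specify the model or the explainer, so the sketch assumes a linear layer from mid-level features to emotions (level 1) and swaps in plain patch occlusion as a simple stand-in for the spectrogram-level explanation (level 2). All names, feature labels, shapes, and random weights are hypothetical.

```python
import numpy as np

# Hypothetical labels and toy models; none of these come from the paper.
MID_LEVEL = ["melodiousness", "articulation", "rhythm_complexity"]
EMOTIONS = ["happy", "sad"]

def mid_level_model(spec, proj):
    """Toy stand-in for an audio-to-mid-level network: time-average, then project."""
    return spec.mean(axis=1) @ proj          # shape (n_mid,)

def emotion_effects(mid, weights):
    """Level 1: contribution of each mid-level feature to each emotion.
    Assumes a linear emotion layer, so contribution = weight * feature value."""
    return weights * mid[np.newaxis, :]      # shape (n_emotions, n_mid)

def occlusion_map(spec, proj, feature_idx, patch=(4, 4)):
    """Level 2 (assumed technique): change in one mid-level feature
    when each spectrogram patch is zeroed out."""
    base = mid_level_model(spec, proj)[feature_idx]
    heat = np.zeros_like(spec)
    pf, pt = patch
    for f in range(0, spec.shape[0], pf):
        for t in range(0, spec.shape[1], pt):
            occluded = spec.copy()
            occluded[f:f + pf, t:t + pt] = 0.0
            drop = base - mid_level_model(occluded, proj)[feature_idx]
            heat[f:f + pf, t:t + pt] = drop
    return heat

rng = np.random.default_rng(0)
spec = rng.random((16, 32))                  # (frequency bins, time frames)
proj = rng.normal(size=(16, len(MID_LEVEL)))
weights = rng.normal(size=(len(EMOTIONS), len(MID_LEVEL)))

# Level 1: which mid-level feature matters most for the chosen emotion?
mid = mid_level_model(spec, proj)
effects = emotion_effects(mid, weights)
emotion_scores = effects.sum(axis=1)         # linear prediction, no bias term
target_emotion = 0
top_feature = int(np.argmax(np.abs(effects[target_emotion])))

# Level 2: which spectrogram regions drive that mid-level feature?
heat = occlusion_map(spec, proj, top_feature)
print(EMOTIONS[target_emotion], "<-", MID_LEVEL[top_feature], heat.shape)
```

The point of the split is that a listener can first read the explanation in musical terms (which perceptual feature pushed the prediction) and only then inspect the audio regions responsible for that one feature, rather than interpreting a single heat map for the emotion directly.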

Taxonomy

Topics: Music and Audio Processing · Model Reduction and Neural Networks · Neural Networks and Applications