Addressing Emotion Bias in Music Emotion Recognition and Generation with Frechet Audio Distance
Yuanchao Li, Azalea Gui, Dimitra Emmanouilidou, Hannes Gamper

TL;DR
This paper investigates emotion bias in music emotion recognition and generation, proposing the use of Frechet Audio Distance with multiple encoders for more objective evaluation and introducing an improved emotional music generation method.
Contribution
The paper introduces an evaluation approach that computes FAD with diverse audio encoders, and it enhances emotional music generation (EMG) to produce more realistic and varied musical emotions.
Findings
FAD computed with multiple encoders provides a more objective evaluation of music emotion recognition (MER)
Enhanced EMG improves emotional variability and realism
Emotion bias affects both recognition and generation tasks
Abstract
The complex nature of musical emotion introduces inherent bias in both recognition and generation, particularly when relying on a single audio encoder, emotion classifier, or evaluation metric. In this work, we conduct a study on Music Emotion Recognition (MER) and Emotional Music Generation (EMG), employing diverse audio encoders alongside Frechet Audio Distance (FAD), a reference-free evaluation metric. Our study begins with a benchmark evaluation of MER, highlighting the limitations of using a single audio encoder and the disparities observed across different measurements. We then propose assessing MER performance using FAD derived from multiple encoders to provide a more objective measure of musical emotion. Furthermore, we introduce an enhanced EMG approach designed to improve both the variability and prominence of generated musical emotion, thereby enhancing its realism.…
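To make the metric concrete, here is a minimal sketch of how a Fréchet distance between two sets of audio embeddings is commonly computed, and how it could be reported per encoder. This is not the authors' implementation: the function names `frechet_distance` and `fad_per_encoder`, the encoder labels, and the random stand-in embeddings are all assumptions for illustration. Real embeddings would come from pretrained audio encoders, which the abstract does not name.

```python
import numpy as np


def frechet_distance(emb_a, emb_b):
    """Frechet distance between Gaussians fitted to two embedding sets.

    emb_a, emb_b: arrays of shape (n_clips, dim), one row per audio clip.
    Computes ||mu_a - mu_b||^2 + Tr(C_a + C_b - 2 (C_a C_b)^{1/2}).
    """
    mu_a, mu_b = emb_a.mean(axis=0), emb_b.mean(axis=0)
    cov_a = np.atleast_2d(np.cov(emb_a, rowvar=False))
    cov_b = np.atleast_2d(np.cov(emb_b, rowvar=False))
    # Tr((C_a C_b)^{1/2}) equals the sum of the square roots of the
    # eigenvalues of C_a C_b, which are real and non-negative in exact
    # arithmetic; clip tiny negative values caused by rounding.
    eigvals = np.linalg.eigvals(cov_a @ cov_b)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0.0, None)).sum()
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a) + np.trace(cov_b) - 2.0 * trace_sqrt)


def fad_per_encoder(embeddings):
    """Hypothetical helper: one FAD score per encoder.

    embeddings: dict mapping an encoder label to a pair
    (reference_embeddings, evaluated_embeddings).
    """
    return {name: frechet_distance(ref, ev) for name, (ref, ev) in embeddings.items()}


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Random stand-ins for embeddings from two hypothetical encoders.
    demo = {
        "encoder_a": (rng.normal(size=(200, 8)), rng.normal(loc=0.5, size=(200, 8))),
        "encoder_b": (rng.normal(size=(200, 16)), rng.normal(size=(200, 16))),
    }
    for name, score in fad_per_encoder(demo).items():
        print(name, round(score, 3))
```

Reporting one score per encoder, rather than a single number, is one way to expose the disagreement between encoders that the abstract describes. How the paper aggregates or compares these scores is not stated in this summary.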
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Advanced Adaptive Filtering Techniques
