Memo2496: Expert-Annotated Dataset and Dual-View Adaptive Framework for Music Emotion Recognition
Qilin Li, C. L. Philip Chen, Tong Zhang

TL;DR
This paper introduces Memo2496, a high-quality annotated music emotion dataset, and DAMER, a dual-view adaptive framework that improves music emotion recognition by addressing feature drift and leveraging multi-modal data.
Contribution
The work presents a large-scale, expert-annotated dataset and a novel dual-view adaptive model with three modules, advancing the accuracy and robustness of music emotion recognition.
Findings
DAMER achieves state-of-the-art performance on multiple datasets.
The dataset ensures high annotation quality through calibration and consistency checks.
Each module of DAMER significantly contributes to performance improvements.
Abstract
Music Emotion Recognition (MER) research faces challenges due to limited high-quality annotated datasets and difficulties in addressing cross-track feature drift. This work presents two primary contributions to address these issues. Memo2496, a large-scale dataset, offers 2496 instrumental music tracks with continuous valence-arousal labels, annotated by 30 certified music specialists. Annotation quality is ensured through calibration with extreme emotion exemplars and a consistency threshold of 0.25, measured by Euclidean distance in the valence-arousal space. Furthermore, the Dual-view Adaptive Music Emotion Recogniser (DAMER) is introduced. DAMER integrates three synergistic modules: Dual Stream Attention Fusion (DSAF) facilitates token-level bidirectional interaction between Mel spectrograms and cochleagrams via cross-attention mechanisms; Progressive Confidence Labelling (PCL)…
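As a rough illustration of the consistency check mentioned in the abstract, the sketch below flags annotations that lie farther than 0.25 (Euclidean distance) from a reference point in the valence-arousal plane. Only the threshold and the distance metric come from the abstract. Comparing each annotator against the per-track mean rating, the function names `euclidean` and `flag_inconsistent`, and the sample ratings are all assumptions made for illustration; the paper may define the comparison differently.

```python
import math

# Threshold stated in the abstract; what it is compared against is an assumption here.
THRESHOLD = 0.25


def euclidean(a, b):
    """Euclidean distance between two (valence, arousal) points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def flag_inconsistent(ratings, threshold=THRESHOLD):
    """Return annotators whose rating is farther than `threshold` from the mean.

    `ratings` maps a (hypothetical) annotator id to a (valence, arousal) pair
    for a single track. Using the per-track mean as the reference point is an
    assumption, not the paper's documented procedure.
    """
    n = len(ratings)
    mean_v = sum(v for v, _ in ratings.values()) / n
    mean_a = sum(a for _, a in ratings.values()) / n
    center = (mean_v, mean_a)
    return sorted(
        name for name, va in ratings.items() if euclidean(va, center) > threshold
    )


# Made-up ratings for one track, used only to exercise the function.
ratings = {"a1": (0.60, 0.50), "a2": (0.65, 0.55), "a3": (0.10, 0.90)}
flagged = flag_inconsistent(ratings)
print(flagged)
```

With these made-up ratings, only the third annotator falls outside the 0.25 radius around the mean, so that annotation would be the one sent back for review under this assumed scheme.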
Taxonomy
Topics: Music and Audio Processing · Emotion and Mood Recognition · Neuroscience and Music Perception
