M3ED: Multi-modal Multi-scene Multi-label Emotional Dialogue Database
Jinming Zhao, Tenggan Zhang, Jingwen Hu, Yuchen Liu, Qin Jin, Xinchao Wang, Haizhou Li

TL;DR
M3ED is a comprehensive multimodal Chinese emotional dialogue dataset with 990 dialogues across 7 emotion categories, designed to advance cross-cultural emotion analysis and recognition.
Contribution
The paper introduces M3ED, the first large-scale multimodal Chinese emotional dialogue dataset, and proposes the MDI framework for dialogue context modeling in emotion recognition.
Findings
M3ED contains 9,082 dialogue turns and 24,449 utterances.
State-of-the-art methods perform well on M3ED, validating its quality.
The MDI framework achieves competitive results in emotion recognition.
Abstract
The emotional state of a speaker can be influenced by many different factors in dialogues, such as dialogue scene, dialogue topic, and interlocutor stimulus. The currently available data resources to support such multimodal affective analysis in dialogues are, however, limited in scale and diversity. In this work, we propose a Multi-modal Multi-scene Multi-label Emotional Dialogue dataset, M3ED, which contains 990 dyadic emotional dialogues from 56 different TV series, a total of 9,082 turns and 24,449 utterances. M3ED is annotated with 7 emotion categories (happy, surprise, sad, disgust, anger, fear, and neutral) at utterance level, and encompasses acoustic, visual, and textual modalities. To the best of our knowledge, M3ED is the first multimodal emotional dialogue dataset in Chinese. It is valuable for cross-culture emotion analysis and recognition. We apply several state-of-the-art…
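As a rough illustration of the utterance-level annotation described in the abstract, the sketch below models one annotated utterance and derives per-dialogue averages from the reported totals. The record layout, the field names (`dialogue_id`, `turn_index`, `speaker`, `text`, `emotions`), and the helper `to_multi_hot` are hypothetical and are not taken from the M3ED release. It also assumes, based on "Multi-label" in the dataset name, that one utterance may carry more than one emotion label.

```python
from dataclasses import dataclass
from typing import FrozenSet, List

# The 7 emotion categories listed in the abstract, in the order given there.
EMOTIONS = ("happy", "surprise", "sad", "disgust", "anger", "fear", "neutral")


@dataclass(frozen=True)
class Utterance:
    """Hypothetical record for one annotated utterance (not the official schema)."""
    dialogue_id: str
    turn_index: int
    speaker: str
    text: str
    emotions: FrozenSet[str]  # assumption: one or more labels per utterance

    def __post_init__(self) -> None:
        # Reject empty label sets and labels outside the 7 categories.
        unknown = self.emotions - set(EMOTIONS)
        if not self.emotions or unknown:
            raise ValueError(f"invalid emotion labels: {sorted(self.emotions)}")


def to_multi_hot(emotions: FrozenSet[str]) -> List[int]:
    """Encode a label set as a 7-dimensional multi-hot vector."""
    return [1 if name in emotions else 0 for name in EMOTIONS]


# Totals reported in the abstract.
NUM_DIALOGUES = 990
NUM_TURNS = 9082
NUM_UTTERANCES = 24449

avg_turns_per_dialogue = NUM_TURNS / NUM_DIALOGUES
avg_utterances_per_dialogue = NUM_UTTERANCES / NUM_DIALOGUES

if __name__ == "__main__":
    example = Utterance("d0001", 0, "A", "(utterance text)", frozenset({"sad", "anger"}))
    print(to_multi_hot(example.emotions))
    print(f"{avg_turns_per_dialogue:.2f} turns per dialogue")
    print(f"{avg_utterances_per_dialogue:.1f} utterances per dialogue")
```

From the reported totals, a dialogue averages roughly 9 turns and roughly 25 utterances, so a turn typically spans more than one utterance.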
Taxonomy
Topics: Sentiment Analysis and Opinion Mining · Speech and dialogue systems · Emotion and Mood Recognition
