FIRMED: A Peak-Centered Multimodal Dataset with Fine-Grained Annotation for Emotion Recognition
Hao Tang, Songyun Xie, Xinzhou Xie, Can Liao, Bohan Li, Zhongyu Tian, Dalu Zheng

TL;DR
FIRMED is a new multimodal dataset with fine-grained, event-centered emotion annotations designed to improve temporal accuracy in emotion recognition from physiological signals and facial data.
Contribution
The paper introduces FIRMED, a peak-centered dataset with synchronized multimodal recordings and validated annotations, enabling more precise emotion recognition research.
Findings
Peak-centered labeling with FIRMED outperforms traditional whole-trial labeling by an average of 3.8 percentage points across eight EEG-based classifiers.
Multimodal fusion further improves emotion recognition accuracy.
Subjective and physiological validation supports annotation quality.
Abstract
Traditional video-induced physiological datasets usually rely on whole-trial labels, which introduce temporal label noise in dynamic emotion recognition. We present FIRMED, a peak-centered multimodal dataset based on an immediate-recall annotation paradigm, with synchronized EEG, ECG, GSR, PPG, and facial recordings from 35 participants. FIRMED provides event-centered timestamps, emotion labels, and intensity annotations, and its annotation quality is supported by subjective and physiological validation. Benchmark experiments show that FIRMED consistently outperforms whole-trial labeling, yielding an average gain of 3.8 percentage points across eight EEG-based classifiers, with further improvements under multimodal fusion. FIRMED provides a practical benchmark for temporally localized supervision in multimodal affective computing.
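The contrast between whole-trial and peak-centered supervision can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the function names, the 128 Hz sampling rate, the 4 s window length, the `(time_s, label, intensity)` event layout, and the random toy signal are all assumptions made for the example.

```python
import numpy as np


def whole_trial_windows(signal, label, win):
    """Cut a trial into non-overlapping windows that all inherit one trial-level label."""
    n = signal.shape[-1] // win
    segments = [signal[..., i * win:(i + 1) * win] for i in range(n)]
    return segments, [label] * n


def peak_centered_windows(signal, events, fs, half_width_s):
    """Cut one window around each annotated peak.

    `events` is a hypothetical list of (time_s, label, intensity) tuples.
    """
    half = int(round(half_width_s * fs))
    n_samples = signal.shape[-1]
    segments, labels = [], []
    for time_s, label, intensity in events:
        center = int(round(time_s * fs))
        start, stop = center - half, center + half
        if start < 0 or stop > n_samples:
            continue  # skip peaks too close to the trial boundary
        segments.append(signal[..., start:stop])
        labels.append((label, intensity))
    return segments, labels


# Toy data (assumed values): 4 channels, 60 s at 128 Hz.
fs = 128
rng = np.random.default_rng(0)
trial = rng.standard_normal((4, 60 * fs))
events = [(12.5, "joy", 3), (41.0, "joy", 5), (59.5, "joy", 2)]

wt_segments, wt_labels = whole_trial_windows(trial, "joy", win=4 * fs)
pc_segments, pc_labels = peak_centered_windows(trial, events, fs, half_width_s=2.0)

print(len(wt_segments), len(pc_segments))
```

Under whole-trial labeling every window carries the same label regardless of when the emotion actually occurred, which is the source of the temporal label noise described above. The peak-centered variant keeps only windows around annotated events and retains the per-event intensity.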
