FIRMED: A Peak-Centered Multimodal Dataset with Fine-Grained Annotation for Emotion Recognition

Hao Tang; Songyun Xie; Xinzhou Xie; Can Liao; Bohan Li; Zhongyu Tian; Dalu Zheng

arXiv:2507.02350 · cs.HC · April 1, 2026


TL;DR

FIRMED is a new multimodal dataset with fine-grained, event-centered emotion annotations designed to improve temporal accuracy in emotion recognition from physiological signals and facial data.

Contribution

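The paper introduces FIRMED, a peak-centered dataset with synchronized EEG, ECG, GSR, PPG, and facial recordings from 35 participants. Each emotional event carries a timestamp, an emotion label, and an intensity annotation, and the annotations are supported by subjective and physiological validation.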

Findings

01

Peak-centered FIRMED labels outperform traditional whole-trial labeling by an average of 3.8 percentage points across eight EEG-based classifiers.

02

Multimodal fusion improves emotion recognition accuracy further, beyond the EEG-only results.

03

Subjective and physiological validation supports annotation quality.

Abstract

Traditional video-induced physiological datasets usually rely on whole-trial labels, which introduce temporal label noise in dynamic emotion recognition. We present FIRMED, a peak-centered multimodal dataset based on an immediate-recall annotation paradigm, with synchronized EEG, ECG, GSR, PPG, and facial recordings from 35 participants. FIRMED provides event-centered timestamps, emotion labels, and intensity annotations, and its annotation quality is supported by subjective and physiological validation. Benchmark experiments show that FIRMED consistently outperforms whole-trial labeling, yielding an average gain of 3.8 percentage points across eight EEG-based classifiers, with further improvements under multimodal fusion. FIRMED provides a practical benchmark for temporally localized supervision in multimodal affective computing.
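The difference between whole-trial and peak-centered supervision can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the `(channels, samples)` array layout, and the window lengths are all assumptions made for illustration, and the abstract does not specify how windows are actually extracted from FIRMED.

```python
import numpy as np


def whole_trial_windows(signal, fs, win_s):
    """Split a (channels, samples) trial into non-overlapping windows.

    Under whole-trial labeling, every returned window would inherit the
    single label assigned to the trial, whether or not the emotion was
    present at that moment.
    """
    win = int(round(win_s * fs))
    n = signal.shape[1] // win
    if n == 0:
        return np.empty((0, signal.shape[0], win))
    return np.stack([signal[:, i * win:(i + 1) * win] for i in range(n)])


def peak_centered_windows(signal, fs, peak_times_s, half_width_s):
    """Cut one window around each annotated event timestamp (in seconds).

    Events whose window would run past the start or end of the recording
    are skipped (an assumed policy). Returns the windows and the indices
    of the events that were kept, so labels can be aligned afterwards.
    """
    half = int(round(half_width_s * fs))
    windows, kept = [], []
    for idx, t in enumerate(peak_times_s):
        center = int(round(t * fs))
        start, stop = center - half, center + half
        if start < 0 or stop > signal.shape[1]:
            continue
        windows.append(signal[:, start:stop])
        kept.append(idx)
    if not windows:
        return np.empty((0, signal.shape[0], 2 * half)), kept
    return np.stack(windows), kept
```

With a hypothetical 10-second, 2-channel trial sampled at 100 Hz, `whole_trial_windows(trial, 100, 2.0)` yields five windows that all share the trial label, while `peak_centered_windows(trial, 100, [5.0], 1.0)` yields a single 2-second window around the reported peak, paired with that event's own label and intensity.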
