EEG-based Multimodal Representation Learning for Emotion Recognition
Kang Yin, Hye-Bin Shin, Dan Li, Seong-Whan Lee

TL;DR
This paper presents a multimodal framework that integrates EEG with video, images, and audio for emotion recognition, handling varying input sizes and using attention to weight feature importance across modalities.
Contribution
It introduces a flexible, attention-based multimodal framework that incorporates EEG data alongside conventional modalities, and it provides benchmark results on a recently introduced emotion recognition dataset.
Findings
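EEG is integrated with video, image, and audio modalities in a single framework.
The framework handles varying input sizes and dynamically adjusts attention across modalities.
Experimental results provide a benchmark for a recently introduced emotion recognition dataset and demonstrate the framework's effectiveness.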
Abstract
Multimodal learning has been a popular area of research, yet integrating electroencephalogram (EEG) data poses unique challenges due to its inherent variability and limited availability. In this paper, we introduce a novel multimodal framework that accommodates not only conventional modalities such as video, images, and audio, but also incorporates EEG data. Our framework is designed to flexibly handle varying input sizes, while dynamically adjusting attention to account for feature importance across modalities. We evaluate our approach on a recently introduced emotion recognition dataset that combines data from three modalities, making it an ideal testbed for multimodal learning. The experimental results provide a benchmark for the dataset and demonstrate the effectiveness of the proposed framework. This work highlights the potential of integrating EEG into multimodal systems, paving…
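The abstract's two design goals, handling inputs of varying size and attending to features by importance across modalities, can be illustrated with a minimal sketch. This is not the authors' architecture: it assumes each modality has already been encoded into a variable number of feature vectors of a shared dimension, and it pools them with a single scaled dot-product attention query. The function name `fuse_modalities`, the modality names, the query vector, and all shapes are hypothetical choices made for illustration.

```python
import numpy as np


def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()


def fuse_modalities(features, query):
    """Attention-pool variable-length modality features into one vector.

    features: dict of modality name -> array of shape (n_tokens, dim);
              n_tokens may differ per modality (assumed pre-encoded).
    query:    array of shape (dim,), standing in for a learned query.
    Returns the fused vector (dim,) and the attention mass per modality.
    """
    names = list(features)
    # Concatenating along the token axis lets any number of tokens
    # per modality pass through the same attention step.
    tokens = np.concatenate([features[n] for n in names], axis=0)
    dim = tokens.shape[1]
    scores = tokens @ query / np.sqrt(dim)  # scaled dot-product scores
    weights = softmax(scores)               # sums to 1 over all tokens
    fused = weights @ tokens                # weighted average, shape (dim,)

    # Total attention weight received by each modality.
    mass, start = {}, 0
    for n in names:
        end = start + features[n].shape[0]
        mass[n] = float(weights[start:end].sum())
        start = end
    return fused, mass


# Illustrative random inputs; real features would come from encoders.
rng = np.random.default_rng(0)
dim = 8
features = {
    "eeg": rng.normal(size=(5, dim)),
    "video": rng.normal(size=(12, dim)),
    "audio": rng.normal(size=(3, dim)),
}
query = rng.normal(size=dim)
fused, mass = fuse_modalities(features, query)
```

Because the softmax runs over all tokens jointly, a modality with more or fewer tokens needs no padding, and the per-modality mass gives a rough view of how much each modality contributes to the fused representation.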
Taxonomy
Topics: Emotion and Mood Recognition
Methods: Softmax · Attention Is All You Need
