Exploiting multi-CNN features in CNN-RNN based Dimensional Emotion Recognition on the OMG in-the-wild Dataset

Dimitrios Kollias; Stefanos Zafeiriou

arXiv:1910.01417·cs.LG·April 13, 2020

TL;DR

This paper introduces a CNN-RNN framework that leverages multi-level CNN features for dimensional emotion recognition in-the-wild, achieving state-of-the-art results using only visual data on the OMG-Emotion dataset.

Contribution

It proposes a novel multi-level feature extraction and fusion approach within CNN-RNN models, enhancing emotion recognition performance in challenging real-world scenarios.

Findings

01

Using only visual data, outperformed state-of-the-art methods that used both audio and visual modalities.

02

Achieved second place in the OMG-Emotion Challenge for valence estimation.

03

Combining low- and high-level features significantly improves arousal estimation.

Abstract

This paper presents a novel CNN-RNN based approach that exploits multiple CNN features for dimensional emotion recognition in-the-wild, using the One-Minute Gradual-Emotion (OMG-Emotion) dataset. Our approach first pre-trains on the large and relevant Aff-Wild and Aff-Wild2 emotion databases. Low-, mid- and high-level features are extracted from the trained CNN component and are exploited by RNN subnets in a multi-task framework. Their outputs constitute an intermediate-level prediction; final estimates are obtained as the mean or median values of these predictions. Fusion of the networks, at Decision-level or at Model-level, is also examined for boosting the obtained performance; in the latter case an RNN was used for the fusion. Our approach, although using only the visual modality, outperformed state-of-the-art methods that utilized audio and visual modalities.…
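
The final aggregation step described in the abstract, where the RNN subnets' intermediate predictions are reduced to one estimate by their mean or median, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `aggregate_predictions`, the `(valence, arousal)` tuple layout, and the sample values are assumptions made purely for illustration.

```python
from statistics import mean, median


def aggregate_predictions(subnet_outputs, how="mean"):
    """Reduce per-subnet (valence, arousal) predictions to one estimate.

    subnet_outputs: list of (valence, arousal) tuples, one per RNN subnet
                    (hypothetical layout, assumed for this sketch).
    how: "mean" or "median", the two reductions named in the abstract.
    """
    if not subnet_outputs:
        raise ValueError("need at least one subnet prediction")
    if how not in ("mean", "median"):
        raise ValueError("how must be 'mean' or 'median'")

    reduce_fn = mean if how == "mean" else median
    valences = [v for v, _ in subnet_outputs]
    arousals = [a for _, a in subnet_outputs]
    # Valence and arousal are aggregated independently of each other.
    return reduce_fn(valences), reduce_fn(arousals)


# Made-up intermediate predictions from three subnets
# (e.g. fed by low-, mid- and high-level CNN features).
intermediate = [(0.2, 0.5), (0.4, 0.1), (0.9, 0.3)]

mean_estimate = aggregate_predictions(intermediate, how="mean")
median_estimate = aggregate_predictions(intermediate, how="median")
```

The median is less sensitive than the mean to a single outlying subnet (the 0.9 valence above), which is one plausible reason to evaluate both reductions.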
