Temporal Multimodal Fusion for Video Emotion Classification in the Wild

Valentin Vielzeuf; Stéphane Pateux; Frédéric Jurie

arXiv:1709.07200 · cs.CV · September 22, 2017

TL;DR

This paper proposes a novel multimodal and temporal fusion approach for video emotion classification, introducing improved face descriptors and a hierarchical fusion method, achieving competitive results on the Emotion in the Wild challenge.

Contribution

It introduces new face descriptors, a hierarchical fusion method, and a CNN architecture tailored for small datasets in video emotion classification.

Findings

01. Achieved 58.8% accuracy on the Emotion in the Wild challenge.

02. Ranked 4th in the 2017 challenge.

03. Demonstrated the effectiveness of hierarchical multimodal fusion.

Abstract

This paper addresses the question of emotion classification. The task consists in predicting emotion labels (taken among a set of possible labels) best describing the emotions contained in short video clips. Building on a standard framework -- lying in describing videos by audio and visual features used by a supervised classifier to infer the labels -- this paper investigates several novel directions. First of all, improved face descriptors based on 2D and 3D Convolutional Neural Networks are proposed. Second, the paper explores several fusion methods, temporal and multimodal, including a novel hierarchical method combining features and scores. In addition, we carefully reviewed the different stages of the pipeline and designed a CNN architecture adapted to the task; this is important as the size of the training set is small compared to the difficulty of the problem, making…
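
To make the idea of score-level multimodal fusion concrete, here is a minimal sketch of a plain weighted average over per-modality class scores. It is not the hierarchical method the abstract refers to, whose details are not given here; it only illustrates the simpler baseline of combining scores from several classifiers. The label list, the modality names (`audio`, `face_2d`, `face_3d`), and the weights are all assumptions chosen for illustration.

```python
from typing import Dict, List, Sequence

# Assumed label set for illustration; the abstract only mentions
# "a set of possible labels" without listing them.
LABELS = ["Angry", "Disgust", "Fear", "Happy", "Neutral", "Sad", "Surprise"]


def fuse_scores(
    scores: Dict[str, Sequence[float]],
    weights: Dict[str, float],
) -> List[float]:
    """Return the weighted mean of per-modality class scores.

    `scores` maps a (hypothetical) modality name to one score per label.
    `weights` maps the same names to non-negative fusion weights.
    """
    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("fusion weights must sum to a positive value")

    fused = [0.0] * len(LABELS)
    for name, modality_scores in scores.items():
        if len(modality_scores) != len(LABELS):
            raise ValueError(f"{name}: expected {len(LABELS)} scores")
        w = weights[name] / total  # normalise so the weights sum to 1
        for i, value in enumerate(modality_scores):
            fused[i] += w * value
    return fused


def predict(
    scores: Dict[str, Sequence[float]],
    weights: Dict[str, float],
) -> str:
    """Return the label with the highest fused score."""
    fused = fuse_scores(scores, weights)
    best = max(range(len(fused)), key=fused.__getitem__)
    return LABELS[best]
```

Calling `predict` with one score vector per modality and a matching weight dictionary returns a single label for the clip. In practice the weights would be tuned on a validation set; the uniform or hand-picked values used for a quick test are only placeholders.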
