Convolutional neural networks pretrained on large face recognition datasets for emotion classification from video

Boris Knyazev; Roman Shvetsov; Natalia Efremova; Artem Kuharenko

arXiv:1711.04598 · cs.CV · November 15, 2017 · 48 cites

TL;DR

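This paper presents an ensemble of models for emotion classification from video, combining audio features with spatial features from CNNs pretrained on large face recognition datasets. The ensemble reaches 60.03% accuracy on the EmotiW 2017 test set, about 1% above the previous best result, without using visual temporal information.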

Contribution

It shows that strong, industry-level face recognition CNNs, used as pretrained networks within an ensemble, increase emotion recognition accuracy and improve on the previous best test-set result by about 1%.

Findings

01. Achieved 60.03% classification accuracy on the EmotiW 2017 test set, about 1% above the previous best result.

02. The ensemble combines several models that capture spatial and audio features from videos.

03. Pretraining on large face recognition datasets increases emotion recognition accuracy.

Abstract

In this paper we describe our entry for the emotion recognition challenge EmotiW 2017. We propose an ensemble of several models, which capture spatial and audio features from videos. Spatial features are captured by convolutional neural networks pretrained on large face recognition datasets. We show that the use of strong industry-level face recognition networks increases the accuracy of emotion recognition. Using our ensemble we improve on the previous best result on the test set by about 1%, achieving a 60.03% classification accuracy without any use of visual temporal information.
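The abstract does not spell out how frame-level features are aggregated or how the models are combined, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes per-frame feature vectors are mean-pooled into one video-level vector (an order-invariant operation, so no visual temporal information is used) and that the per-model class scores are merged by a weighted average. The function names, the weights, and the seven emotion labels are hypothetical choices made for illustration.

```python
from statistics import fmean

# Assumed label set for illustration; the abstract does not list the classes.
EMOTIONS = ["Angry", "Disgust", "Fear", "Happy", "Neutral", "Sad", "Surprise"]


def pool_frames(frame_features):
    """Average per-frame feature vectors into one video-level vector.

    Mean pooling ignores frame order, so the result carries no
    temporal information (an assumption about how frames are combined).
    """
    if not frame_features:
        raise ValueError("need at least one frame")
    return [fmean(column) for column in zip(*frame_features)]


def fuse_scores(model_scores, weights):
    """Weighted average of per-model class scores (late fusion).

    model_scores: one list of class scores per model (e.g. face CNN, audio).
    weights: one non-negative weight per model (hypothetical values).
    """
    if len(model_scores) != len(weights):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    n_classes = len(model_scores[0])
    return [
        sum(w * scores[c] for w, scores in zip(weights, model_scores)) / total
        for c in range(n_classes)
    ]


def predict(model_scores, weights):
    """Return the emotion label with the highest fused score."""
    fused = fuse_scores(model_scores, weights)
    best = max(range(len(fused)), key=fused.__getitem__)
    return EMOTIONS[best]


if __name__ == "__main__":
    # Made-up scores from a spatial (face) model and an audio model.
    face_scores = [0.1, 0.1, 0.1, 0.4, 0.1, 0.1, 0.1]
    audio_scores = [0.1, 0.1, 0.1, 0.1, 0.1, 0.4, 0.1]
    print(predict([face_scores, audio_scores], weights=[2.0, 1.0]))
```

Weighting the spatial model more heavily here is an arbitrary illustration; in practice such weights would be tuned on a validation set.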

Taxonomy

Topics: Face recognition and analysis · Human Pose and Action Recognition · Video Surveillance and Tracking Methods