Facial Emotion Recognition using Deep Residual Networks in Real-World Environments
Panagiotis Tzirakis, Dénes Boros, Elnar Hajiyev, Björn W. Schuller

TL;DR
This paper introduces a deep residual network-based facial feature extractor trained on a large in-the-wild video dataset, leveraging LSTM to capture temporal dynamics for improved emotion recognition in real-world environments.
Contribution
It presents a novel facial feature extractor trained on a massive real-world dataset, incorporating temporal modeling with LSTM for enhanced affect recognition performance.
Findings
Achieved state-of-the-art results on the RECOLA database.
Demonstrated the effectiveness of large-scale in-the-wild training data.
Validated the model's superior performance in real-world emotion recognition.
Abstract
Automatic affect recognition using visual cues is an important task towards a complete interaction between humans and machines. Applications can be found in tutoring systems and human-computer interaction. A critical step in that direction is facial feature extraction. In this paper, we propose a facial feature extractor model trained on an in-the-wild and massively collected video dataset provided by the RealEyes company. The dataset consists of a million labelled frames and 2,616 thousand subjects. As temporal information is important to the emotion recognition domain, we utilise LSTM cells to capture the temporal dynamics in the data. To show the favourable properties of our pre-trained model on modelling facial affect, we use the RECOLA database, and compare with the current state-of-the-art approach. Our model provides the best results in terms of concordance correlation…
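The abstract reports results in terms of concordance correlation. Assuming this refers to Lin's concordance correlation coefficient (CCC), a metric commonly used for continuous affect prediction, the following is a minimal sketch of how it could be computed between a predicted trace and a gold-standard trace. The function name and inputs are hypothetical, and this is not the authors' evaluation code.

```python
from typing import Sequence


def concordance_correlation(pred: Sequence[float], gold: Sequence[float]) -> float:
    """Lin's concordance correlation coefficient for two equal-length series.

    Hypothetical helper for illustration; not taken from the paper.
    """
    if len(pred) != len(gold) or len(pred) == 0:
        raise ValueError("inputs must be non-empty and of equal length")

    n = len(pred)
    mean_p = sum(pred) / n
    mean_g = sum(gold) / n

    # Population (divide-by-n) variances and covariance.
    var_p = sum((p - mean_p) ** 2 for p in pred) / n
    var_g = sum((g - mean_g) ** 2 for g in gold) / n
    cov = sum((p - mean_p) * (g - mean_g) for p, g in zip(pred, gold)) / n

    # The squared mean difference penalises a constant offset, which plain
    # Pearson correlation would ignore.
    denom = var_p + var_g + (mean_p - mean_g) ** 2
    if denom == 0.0:
        raise ValueError("CCC is undefined when both series are constant and equal")
    return 2.0 * cov / denom
```

Unlike Pearson correlation, CCC drops below 1 when predictions are perfectly correlated with the gold standard but shifted or rescaled, which is why it suits tasks where the absolute value of the prediction matters.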
Taxonomy
Topics: Emotion and Mood Recognition · Face recognition and analysis · Video Surveillance and Tracking Methods
Methods: Tanh Activation · Sigmoid Activation · Long Short-Term Memory
