TL;DR
This paper introduces a deformable convolutional LSTM model that enhances human body emotion recognition from videos by handling deformations like scaling and rotation, achieving state-of-the-art accuracy on the GEMEP dataset.
Contribution
It proposes integrating deformable convolutions into ConvLSTM to improve robustness and accuracy in emotion recognition from body expressions in videos.
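To make the idea concrete, the following is a minimal sketch of the deformable sampling step, not the authors' implementation. A deformable convolution adds an offset to each kernel tap and reads the input at the resulting fractional position by bilinear interpolation. The function names (`bilinear`, `deform_conv2d`), the single-channel 3x3 setup, and the hand-supplied offsets are assumptions made for illustration. In a trained network the offsets would be predicted by a learned layer, and this summary does not say exactly which ConvLSTM convolutions are made deformable.

```python
import math

def bilinear(img, y, x):
    """Sample a 2-D list `img` at fractional (y, x); out-of-bounds reads as 0."""
    h, w = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    total = 0.0
    for yy in (y0, y0 + 1):
        for xx in (x0, x0 + 1):
            if 0 <= yy < h and 0 <= xx < w:
                # Weight falls off linearly with distance to the grid point.
                weight = (1 - abs(y - yy)) * (1 - abs(x - xx))
                total += weight * img[yy][xx]
    return total

def deform_conv2d(img, kernel, offsets):
    """Single-channel 3x3 deformable convolution, stride 1, zero padding 1.

    offsets[i][j] holds 9 (dy, dx) pairs, one per kernel tap, for output
    pixel (i, j). Here they are supplied by hand (illustrative assumption).
    """
    h, w = len(img), len(img[0])
    taps = [(ky, kx) for ky in (-1, 0, 1) for kx in (-1, 0, 1)]
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for k, (ky, kx) in enumerate(taps):
                dy, dx = offsets[i][j][k]
                # Regular grid position plus the per-tap offset.
                acc += kernel[ky + 1][kx + 1] * bilinear(img, i + ky + dy, j + kx + dx)
            out[i][j] = acc
    return out

# Toy input: with all offsets zero the result is an ordinary 3x3 convolution.
img = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
ones = [[1.0] * 3 for _ in range(3)]
zero_offsets = [[[(0.0, 0.0)] * 9 for _ in range(3)] for _ in range(3)]
plain = deform_conv2d(img, ones, zero_offsets)
```

With zero offsets the operation reduces to a standard convolution, which is why deformable layers can replace ordinary ones without changing the rest of an architecture. Non-zero offsets let each output position sample a shifted or warped neighbourhood, which is the mechanism that makes such layers tolerant of scaling and rotation.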
Findings
Achieved 98.8% accuracy on the GEMEP dataset.
Outperformed existing methods in whole-body emotion recognition.
Demonstrated robustness to image deformations like scaling and rotation.
Abstract
People express their emotions in myriad ways. Among the most important are whole-body expressions, which have many applications in fields such as human-computer interaction (HCI). One of the main challenges in human emotion recognition is that people express the same feeling in different ways using their face and body. Recently, many methods have tried to overcome this challenge using Deep Neural Networks (DNNs). However, most of these methods were based on images or on facial expressions only, and did not consider deformations that may occur in the images, such as scaling and rotation, which can adversely affect recognition accuracy. In this work, motivated by recent research on deformable convolutions, we incorporate the deformable behavior into the core of convolutional long short-term memory (ConvLSTM) to improve robustness to these deformations…
