Hybrid-supervised Hypergraph-enhanced Transformer for Micro-gesture Based Emotion Recognition
Zhaoqiang Xia, Hexiang Huang, Haoyu Chen, Xiaoyi Feng, and Guoying Zhao

TL;DR
This paper introduces a hybrid-supervised hypergraph-enhanced Transformer model for recognizing human emotions from micro-gestures, effectively capturing subtle motions and relationships between body joints.
Contribution
It proposes a novel hypergraph-enhanced Transformer framework with self-supervised reconstruction and supervised emotion recognition, improving micro-gesture based emotion understanding.
Findings
Achieves state-of-the-art performance on iMiGUE and SMG datasets.
Effectively models subtle micro-gesture motions and joint relationships.
Outperforms existing methods in emotion recognition accuracy.
Abstract
Micro-gestures are unconsciously performed body gestures that can convey human emotional states, and they are starting to attract more research attention in human behavior understanding and affective computing as an emerging topic. However, the modeling of human emotion based on micro-gestures has not been sufficiently explored. In this work, we propose to recognize emotional states from micro-gestures by reconstructing the behavior patterns with a hypergraph-enhanced Transformer in a hybrid-supervised framework. In the framework, a hypergraph Transformer based encoder and decoder are separately designed by stacking hypergraph-enhanced self-attention and multiscale temporal convolution modules. In particular, to better capture the subtle motion of micro-gestures, we construct a decoder with additional upsampling operations for a reconstruction task in a self-supervised…
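To make the idea of hypergraph-enhanced self-attention more concrete, here is a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the formulation, so the sketch assumes the widely used normalized hypergraph affinity D_v^{-1/2} H D_e^{-1} H^T D_v^{-1/2} and injects it as an additive bias on the attention scores. The toy incidence matrix, the function names, and the feature sizes are all hypothetical.

```python
import numpy as np

def hypergraph_adjacency(H):
    """Joint-to-joint affinity from an incidence matrix H (joints x hyperedges).

    Assumes every joint belongs to at least one hyperedge and no hyperedge is empty.
    """
    H = np.asarray(H, dtype=float)
    dv_inv_sqrt = 1.0 / np.sqrt(H.sum(axis=1))  # per-joint degree^-1/2
    de_inv = 1.0 / H.sum(axis=0)                # per-hyperedge degree^-1
    left = dv_inv_sqrt[:, None] * H * de_inv[None, :]
    right = H.T * dv_inv_sqrt[None, :]
    return left @ right

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)     # subtract max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def hypergraph_attention(X, Wq, Wk, Wv, A):
    """Single-head self-attention over joints with a hypergraph bias.

    Adding A to the scores is an assumption made for illustration only.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(Q.shape[-1])
    attn = softmax(scores + A, axis=-1)
    return attn @ V, attn

# Hypothetical toy skeleton: 5 joints grouped into 2 hyperedges (joint 2 is shared).
H = np.array([[1, 0], [1, 0], [1, 1], [0, 1], [0, 1]])
A = hypergraph_adjacency(H)

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 8))                     # one frame: 5 joints, 8 features each
Wq, Wk, Wv = (rng.normal(size=(8, 4)) for _ in range(3))
out, attn = hypergraph_attention(X, Wq, Wk, Wv, A)
```

Because a hyperedge can connect more than two joints, the bias raises the attention scores among joints that share a group (for example, all joints of one hand), whereas an ordinary skeleton graph only links physically adjacent joints.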
Taxonomy
Topics: Emotion and Mood Recognition · Human Pose and Action Recognition · Sentiment Analysis and Opinion Mining
