HTNet for micro-expression recognition
Zhifeng Wang, Kaihao Zhang, Wenhan Luo, Ramesh Sankaranarayana

TL;DR
This paper introduces HTNet, a hierarchical transformer network that effectively captures local and global facial muscle movements for improved micro-expression recognition.
Contribution
The paper proposes a novel HTNet architecture that models spatial relationships between facial regions and local muscle movements, enhancing micro-expression recognition performance.
Findings
Outperforms previous methods on four datasets
Effectively captures local muscle movements with transformer layers
Learns interactions between facial regions
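The two ideas in the findings above, attention restricted to local regions followed by attention across regions, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`self_attention`, `split_into_blocks`, `mean_pool`, `hierarchical_pass`), the mean-pooling aggregation, the 2x2 block size, and the use of identity projections in place of learned weights are all assumptions made for illustration.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(tokens):
    # Single-head scaled dot-product self-attention.
    # Assumption: identity Q/K/V projections (no learned weights).
    d = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[i] for w, v in zip(weights, tokens)) for i in range(d)])
    return out

def split_into_blocks(grid, block):
    # Partition an H x W grid of token vectors into non-overlapping
    # block x block regions. Assumes H and W are divisible by `block`.
    h, w = len(grid), len(grid[0])
    blocks = []
    for r in range(0, h, block):
        for c in range(0, w, block):
            blocks.append([grid[i][j]
                           for i in range(r, r + block)
                           for j in range(c, c + block)])
    return blocks

def mean_pool(tokens):
    # Summarise a region as the element-wise mean of its tokens (hypothetical choice).
    d = len(tokens[0])
    return [sum(t[i] for t in tokens) / len(tokens) for i in range(d)]

def hierarchical_pass(grid, block):
    # Level 1: attention restricted to the tokens inside each local region.
    local = [self_attention(b) for b in split_into_blocks(grid, block)]
    # Aggregate each region into a single token.
    region_tokens = [mean_pool(b) for b in local]
    # Level 2: attention across region tokens (interactions between regions).
    return self_attention(region_tokens)

# Toy input: a 4 x 4 grid of 2-dimensional tokens.
grid = [[[float(i), float(j)] for j in range(4)] for i in range(4)]
regions = hierarchical_pass(grid, block=2)
print(len(regions), len(regions[0]))  # prints "4 2"
```

In this toy setup the 4 x 4 grid yields four region tokens, so the second attention level only ever mixes information between regions, while the first level only mixes information within a region.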
Abstract
Facial expression is related to facial muscle contractions and different muscle movements correspond to different emotional states. For micro-expression recognition, the muscle movements are usually subtle, which has a negative impact on the performance of current facial emotion recognition algorithms. Most existing methods use self-attention mechanisms to capture relationships between tokens in a sequence, but they do not take into account the inherent spatial relationships between facial landmarks. This can result in sub-optimal performance on micro-expression recognition tasks. Therefore, learning to recognize facial muscle movements is a key challenge in the area of micro-expression recognition. In this paper, we propose a Hierarchical Transformer Network (HTNet) to identify critical areas of facial muscle movement. HTNet includes two major components: a transformer layer that…
Taxonomy
Topics: Emotion and Mood Recognition · Advanced Computing and Algorithms · Facial Nerve Paralysis Treatment and Research
Methods: Multi-Head Attention · Attention Is All You Need · Byte Pair Encoding · Linear Layer · Softmax · Layer Normalization · Dense Connections · Dropout · Focus · Position-Wise Feed-Forward Layer
