Micro-AU CLIP: Fine-Grained Contrastive Learning from Local Independence to Global Dependency for Micro-Expression Action Unit Detection
Jinsheng Wei, Fengzhou Guo, Yante Li, Haoyu Chen, Guanming Lu, Guoying Zhao

TL;DR
Micro-AU CLIP is a framework for micro-expression action unit (Micro-AU) detection that models the local independence and global dependency of facial action units, achieving state-of-the-art detection results and improving micro-expression recognition.
Contribution
The paper proposes a new micro-AU detection framework that combines local semantic independence (LSI) modeling with global semantic dependency (GSD) modeling, together with a specialized contrastive loss for fine-grained feature learning.
Findings
Achieves state-of-the-art micro-AU detection performance
Effectively models local independence and global dependency of AUs
Enhances micro-expression recognition accuracy
Abstract
Micro-expression (ME) action units (Micro-AUs) provide objective clues for fine-grained genuine emotion analysis. Most existing Micro-AU detection methods learn AU features from the whole facial image/video, which conflicts with the inherent locality of AU, resulting in insufficient perception of AU regions. In fact, each AU independently corresponds to specific localized facial muscle movements (local independence), while there is an inherent dependency between some AUs under specific emotional states (global dependency). Thus, this paper explores the effectiveness of the independence-to-dependency pattern and proposes a novel micro-AU detection framework, micro-AU CLIP, that uniquely decomposes the AU detection process into local semantic independence modeling (LSI) and global semantic dependency (GSD) modeling. In LSI, Patch Token Attention (PTA) is designed, mapping several local…
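The independence-to-dependency pattern described in the abstract can be illustrated with a toy sketch. It is not the paper's PTA or GSD module, whose details are cut off above. The region lists, the per-AU query vectors, the function names, and the use of plain dot-product attention are all assumptions made for illustration. Step 1 pools patch tokens separately for each AU using only that AU's own region (local independence). Step 2 lets the resulting AU features attend to one another (global dependency).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def weighted_sum(weights, vectors):
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]


def local_independent_features(patch_tokens, au_queries, au_regions):
    """Hypothetical step 1: each AU attends only to the patches in its own region."""
    features = []
    for query, region in zip(au_queries, au_regions):
        tokens = [patch_tokens[i] for i in region]
        scale = math.sqrt(len(query))
        weights = softmax([dot(query, t) / scale for t in tokens])
        features.append(weighted_sum(weights, tokens))
    return features


def global_dependency(au_features):
    """Hypothetical step 2: self-attention across AU features, with a residual add."""
    scale = math.sqrt(len(au_features[0]))
    fused = []
    for f in au_features:
        weights = softmax([dot(f, g) / scale for g in au_features])
        mixed = weighted_sum(weights, au_features)
        fused.append([a + b for a, b in zip(f, mixed)])
    return fused


# Toy input: four 2-d patch tokens and two AUs, each tied to two patches (assumed layout).
patch_tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.5, -0.5]]
au_queries = [[1.0, 0.0], [0.0, 1.0]]
au_regions = [[0, 1], [2, 3]]

local = local_independent_features(patch_tokens, au_queries, au_regions)
fused = global_dependency(local)
```

In this sketch, changing a patch outside an AU's region leaves that AU's local feature untouched, while its fused feature does change, because the second step mixes information across AUs.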
Taxonomy
Topics: Emotion and Mood Recognition · Face and Expression Recognition · Human Pose and Action Recognition
