TL;DR
This paper introduces MELLM, a large language model that pairs optical-flow analysis of subtle facial motion with LLM reasoning for micro-expression understanding. The authors report that it surpasses existing methods in both accuracy and interpretability.
Contribution
The work presents MEFlowNet, the first dedicated micro-expression flow estimator, and MELLM, the first LLM designed specifically for micro-expression understanding.
Findings
MEFlowNet outperforms existing optical flow methods in facial motion estimation.
MELLM achieves state-of-the-art accuracy on multiple micro-expression benchmarks.
The approach enables human-readable descriptions and emotional inferences from micro-movements.
Abstract
Micro-expressions (MEs), brief and low-intensity facial movements revealing concealed emotions, are crucial for affective computing. Despite notable progress in ME recognition, existing methods are largely confined to discrete emotion classification, lacking the capacity for comprehensive ME Understanding (MEU), particularly in interpreting subtle facial dynamics and underlying emotional cues. While Multimodal Large Language Models (MLLMs) offer potential for MEU with their advanced reasoning abilities, they still struggle to perceive such subtle facial affective behaviors. To bridge this gap, we propose a ME Large Language Model (MELLM) that integrates optical flow-based sensitivity to subtle facial motions with the powerful inference ability of LLMs. Specifically, an iterative, warping-based optical-flow estimator, named MEFlowNet, is introduced to precisely capture facial…
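The excerpt above does not describe MEFlowNet's internals, so the following is only a generic illustration of the "warping" step that warping-based optical-flow estimators rely on: given a current flow estimate, the second frame is resampled back toward the first so that the remaining misalignment can be estimated in the next iteration. The function name `backward_warp`, the `(dx, dy)` flow layout, and the border clamping are assumptions made for this sketch, not details taken from the paper.

```python
import numpy as np

def backward_warp(image, flow):
    """Resample `image` (H, W) toward the reference frame using `flow` (H, W, 2).

    Assumed convention (hypothetical, not from the paper): flow[y, x] = (dx, dy)
    means the pixel at (x, y) in the reference frame appears at (x + dx, y + dy)
    in `image`. Sampling uses bilinear interpolation with border clamping.
    """
    h, w = image.shape
    ys, xs = np.mgrid[0:h, 0:w].astype(np.float64)

    # Sampling coordinates in `image`, clamped to the valid range.
    sx = np.clip(xs + flow[..., 0], 0, w - 1)
    sy = np.clip(ys + flow[..., 1], 0, h - 1)

    # Integer corners around each sampling location.
    x0 = np.floor(sx).astype(int)
    y0 = np.floor(sy).astype(int)
    x1 = np.minimum(x0 + 1, w - 1)
    y1 = np.minimum(y0 + 1, h - 1)

    # Bilinear weights and interpolation.
    wx = sx - x0
    wy = sy - y0
    top = (1 - wx) * image[y0, x0] + wx * image[y0, x1]
    bottom = (1 - wx) * image[y1, x0] + wx * image[y1, x1]
    return (1 - wy) * top + wy * bottom

# Toy example: the "apex" frame is the "onset" frame shifted right by one pixel.
onset = np.arange(20, dtype=float).reshape(4, 5)
apex = np.roll(onset, 1, axis=1)
flow = np.zeros((4, 5, 2))
flow[..., 0] = 1.0  # every pixel moved one column to the right
warped = backward_warp(apex, flow)
```

With a correct flow, `warped` matches `onset` everywhere except the clamped last column. In an iterative estimator, whatever difference remains after warping is what the next refinement step tries to explain.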
