AULLM++: Structural Reasoning with Large Language Models for Micro-Expression Recognition

Zhishu Liu, Kaishen Yuan, Bo Zhao, Hui Ma, and Zitong Yu

arXiv:2603.08387 · cs.CV · March 10, 2026

Open Access

TL;DR

AULLM++ is a reasoning framework that uses Large Language Models for micro-expression Action Unit (AU) detection. It addresses limitations of earlier methods by injecting visual features into textual prompts, modeling inter-AU relationships, and improving generalization.

Contribution

The paper presents a novel reasoning-oriented approach that combines visual feature fusion, structural AU modeling, and counterfactual regularization within LLM prompts for improved micro-expression recognition.

Findings

01

Achieves state-of-the-art performance on benchmark datasets.

02

Demonstrates superior cross-domain generalization.

03

Effectively models inter-AU relationships with a graph neural network.

Abstract

Micro-expression Action Unit (AU) detection identifies localized AUs from subtle facial muscle activations, providing a foundation for decoding affective cues. Previous methods face three key limitations: (1) heavy reliance on low-density visual information, rendering discriminative evidence vulnerable to background noise; (2) coarse-grained feature processing that misaligns with the demand for fine-grained representations; and (3) neglect of inter-AU correlations, restricting the parsing of complex expression patterns. We propose AULLM++, a reasoning-oriented framework leveraging Large Language Models (LLMs), which injects visual features into textual prompts as actionable semantic premises to guide inference. It formulates AU prediction into three stages: evidence construction, structure modeling, and deduction-based prediction. Specifically, a Multi-Granularity Evidence-Enhanced…

Taxonomy

Topics: Emotion and Mood Recognition · Multimodal Machine Learning Applications · Face Recognition and Perception