AU-LLM: Micro-Expression Action Unit Detection via Enhanced LLM-Based Feature Fusion

Zhishu Liu; Kaishen Yuan; Bo Zhao; Yong Xu; Zitong Yu

arXiv:2507.21778·cs.CV·July 30, 2025

TL;DR

This paper introduces AU-LLM, a framework that pairs a Large Language Model with an enhanced feature fusion technique to detect subtle micro-expression Action Units, achieving state-of-the-art results on the CASME II and SAMM datasets.

Contribution

It pioneers the use of LLMs for micro-expression AU detection and proposes the Enhanced Fusion Projector for effective vision-language feature integration.

Findings

01 Achieves state-of-the-art performance on CASME II and SAMM datasets.

02 Demonstrates robustness across leave-one-subject-out (LOSO) and cross-domain protocols.

03 Validates the effectiveness of LLM-based reasoning in subtle facial expression analysis.
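
For readers unfamiliar with the LOSO protocol named in the findings, the sketch below shows the general idea: each subject is held out once as the test set while all remaining subjects form the training set. The sample records, the `subject` and `clip` keys, and the `loso_splits` helper are hypothetical and chosen only for illustration; the paper's actual data loading and fold construction are not described on this page.

```python
from collections import defaultdict


def loso_splits(samples):
    """Yield (held_out_subject, train, test) once per subject.

    `samples` is a list of dicts with a "subject" key (an assumed layout).
    """
    by_subject = defaultdict(list)
    for sample in samples:
        by_subject[sample["subject"]].append(sample)

    for held_out in sorted(by_subject):
        # Test set: every clip from the held-out subject.
        test = by_subject[held_out]
        # Training set: every clip from all other subjects.
        train = [
            s
            for subject, group in by_subject.items()
            if subject != held_out
            for s in group
        ]
        yield held_out, train, test


# Toy, made-up records purely to exercise the split logic.
samples = [
    {"subject": "s01", "clip": "a"},
    {"subject": "s01", "clip": "b"},
    {"subject": "s02", "clip": "c"},
    {"subject": "s03", "clip": "d"},
]
folds = list(loso_splits(samples))
```

Because no subject ever appears in both the training and test set of a fold, the reported score reflects generalization to unseen people rather than memorized identities.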

Abstract

The detection of micro-expression Action Units (AUs) is a formidable challenge in affective computing, pivotal for decoding subtle, involuntary human emotions. While Large Language Models (LLMs) demonstrate profound reasoning abilities, their application to the fine-grained, low-intensity domain of micro-expression AU detection remains unexplored. This paper pioneers this direction by introducing AU-LLM, a novel framework that, for the first time, uses an LLM to detect AUs in micro-expression datasets, which are characterized by subtle intensities and scarce data. To bridge the critical vision-language semantic gap, we propose the Enhanced Fusion Projector (EFP). The EFP employs a Multi-Layer Perceptron (MLP) to fuse mid-level (local texture) and high-level (global semantics) visual features from a specialized 3D-CNN backbone into a single, information-dense token. This…
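
The fusion step described in the abstract can be illustrated with a minimal sketch: two feature vectors are combined and passed through a small MLP that outputs one token. Everything specific here is an assumption made for illustration, not the authors' implementation: fusing by concatenation, the two-layer depth, the ReLU activation, the random initialization, and all dimensions and names (`fuse_to_token`, `MID_DIM`, `HIGH_DIM`, `HIDDEN_DIM`, `TOKEN_DIM`). Pure Python is used so the example runs without extra libraries.

```python
import math
import random


def linear(x, weights, bias):
    """Dense layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x):
    return [max(0.0, v) for v in x]


def init_layer(rng, n_in, n_out):
    """Random weights and zero bias (illustrative initialization only)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def fuse_to_token(mid_feat, high_feat, params):
    """Concatenate both feature levels (an assumption) and map them to one token."""
    (w1, b1), (w2, b2) = params
    x = list(mid_feat) + list(high_feat)
    hidden = relu(linear(x, w1, b1))
    return linear(hidden, w2, b2)


# Hypothetical sizes; the real feature and LLM embedding sizes are not given here.
MID_DIM, HIGH_DIM, HIDDEN_DIM, TOKEN_DIM = 8, 4, 16, 6

rng = random.Random(0)
params = (
    init_layer(rng, MID_DIM + HIGH_DIM, HIDDEN_DIM),
    init_layer(rng, HIDDEN_DIM, TOKEN_DIM),
)
mid_feat = [rng.random() for _ in range(MID_DIM)]    # stand-in for local texture features
high_feat = [rng.random() for _ in range(HIGH_DIM)]  # stand-in for global semantic features

token = fuse_to_token(mid_feat, high_feat, params)
print(len(token))
```

In a setup like this, the output vector would have the same width as the LLM's token embeddings, so it could be placed in the input sequence alongside text tokens.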
