iMiGUE-3K: A Large-Scale Benchmark for Micro-Gesture Analysis with Self-Supervised Learning

Chengyan Wang, Haoyu Chen, Hui Wei, Yueyi Yang, Yunquan Chen, and Guoying Zhao

arXiv:2605.17179·cs.CV·May 19, 2026

TL;DR

This paper introduces iMiGUE-3K, the largest micro-gesture dataset to date for emotion understanding, along with foundation models and evaluation tasks, to advance research in affective computing and human-computer interaction.

Contribution

It presents a novel large-scale dataset (iMiGUE-3K), a series of foundation models for micro-gesture analysis, and comprehensive evaluation tasks to enhance emotion understanding research.

Findings

01. Micro-gesture analysis improves emotion recognition accuracy.
02. iMiGUE-3K dataset contains over 3.4K video clips and 37 million frames.
03. Proposed models outperform existing methods in gesture-based emotion tasks.

Abstract

Emotion understanding is a fundamental challenge in affective computing and artificial intelligence. While existing approaches predominantly focus on facial expressions and speech, they often overlook the rich emotional cues conveyed through body language. Recently, micro-gestures (MGs), unintentional, subconscious movements driven by inner feelings, have attracted increasing attention as an alternative to other cues. However, no existing large-scale dataset supports the pre-training of MG foundation models. To advance MG research, we present a new benchmark for micro-gesture-based emotion understanding, whose key contributions are a novel dataset (iMiGUE-3K) and a series of foundation models for different tasks. Using a model-based crowd-sourcing data collection strategy, we construct iMiGUE-3K, the largest MG dataset to date. It comprises video recordings from 332…
