HuMoCon: Concept Discovery for Human Motion Understanding
Qihang Fang, Chengcheng Tang, Bugra Tekin, Shugao Ma, Yanchao Yang

TL;DR
HuMoCon is a framework for human motion understanding that discovers motion concepts by training multi-modal (video and motion) encoders with explicit feature alignment and velocity reconstruction, yielding features better suited to behavior analysis tasks.
Contribution
HuMoCon introduces a novel multi-modal motion concept discovery framework with feature alignment and velocity reconstruction to enhance human motion understanding.
Findings
Outperforms state-of-the-art methods on standard benchmarks.
Effectively discovers and models human motion concepts.
Enhances high-frequency feature representation and temporal modeling.
Abstract
We present HuMoCon, a novel motion-video understanding framework designed for advanced human behavior analysis. The core of our method is a human motion concept discovery framework that efficiently trains multi-modal encoders to extract semantically meaningful and generalizable features. HuMoCon addresses key challenges in motion concept discovery for understanding and reasoning, including the lack of explicit multi-modality feature alignment and the loss of high-frequency information in masked autoencoding frameworks. Our approach integrates a feature alignment strategy that leverages video for contextual understanding and motion for fine-grained interaction modeling, together with a velocity reconstruction mechanism that enhances high-frequency feature expression and mitigates temporal over-smoothing. Comprehensive experiments on standard benchmarks demonstrate that HuMoCon enables…
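The abstract does not give the form of the velocity reconstruction objective, so the following is only a minimal sketch of the general idea, under stated assumptions: velocity is taken to be the frame-to-frame finite difference of the motion sequence, and the loss is a position mean squared error plus a weighted velocity mean squared error. The function names and the `velocity_weight` parameter are hypothetical and are not taken from the paper.

```python
from typing import List

# A motion sequence as frames x features (e.g. flattened joint coordinates).
# This representation is an assumption made for illustration only.
Motion = List[List[float]]


def finite_difference_velocity(motion: Motion) -> Motion:
    """Approximate velocity as v[t] = x[t+1] - x[t] for consecutive frames."""
    return [
        [nxt - cur for cur, nxt in zip(cur_frame, next_frame)]
        for cur_frame, next_frame in zip(motion, motion[1:])
    ]


def mean_squared_error(pred: Motion, target: Motion) -> float:
    """Mean of squared element-wise differences; 0.0 for empty input."""
    total = 0.0
    count = 0
    for pred_frame, target_frame in zip(pred, target):
        for p, t in zip(pred_frame, target_frame):
            total += (p - t) ** 2
            count += 1
    return total / count if count else 0.0


def reconstruction_loss(
    pred: Motion, target: Motion, velocity_weight: float = 1.0
) -> float:
    """Position error plus a weighted velocity error (hypothetical form)."""
    position_term = mean_squared_error(pred, target)
    velocity_term = mean_squared_error(
        finite_difference_velocity(pred),
        finite_difference_velocity(target),
    )
    return position_term + velocity_weight * velocity_term


# A rapidly oscillating target and an over-smoothed reconstruction of it.
target = [[0.0], [1.0], [0.0], [1.0]]
over_smoothed = [[0.5], [0.5], [0.5], [0.5]]

position_only = mean_squared_error(over_smoothed, target)
with_velocity = reconstruction_loss(over_smoothed, target)
print(position_only, with_velocity)
```

In this toy case the over-smoothed reconstruction has a small position error, but its velocity is zero everywhere while the target's velocity alternates in sign, so the velocity term dominates the combined loss. That is the intuition behind using a velocity target to discourage temporal over-smoothing; the actual mechanism in HuMoCon may differ.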
Taxonomy
Topics: Data Management and Algorithms · Video Analysis and Summarization · Time Series Analysis and Forecasting
