Human-in-Context: Unified Cross-Domain 3D Human Motion Modeling via In-Context Learning

Mengyuan Liu; Xinshun Wang; Zhongbin Fang; Deheng Ye; Xia Li; Tao Tang; Songtao Wu; Xiangtai Li; Ming-Hsuan Yang

arXiv:2508.10897 · cs.CV · August 15, 2025

TL;DR

This paper introduces Human-in-Context, a unified model for 3D human motion across multiple domains, tasks, and modalities, using in-context learning and novel strategies to improve generalization and scalability.

Contribution

The paper proposes Human-in-Context (HiC), a unified framework for cross-domain 3D human motion modeling that is trained in a single process, eliminating domain-specific components and multi-stage training.

Findings

01

HiC generalizes better than PiC and scales to more data.

02

The model effectively handles multiple modalities and tasks.

03

Experimental results demonstrate improved performance across diverse domains.

Abstract

This paper aims to model 3D human motion across domains, where a single model is expected to handle multiple modalities, tasks, and datasets. Existing cross-domain models often rely on domain-specific components and multi-stage training, which limits their practicality and scalability. To overcome these challenges, we propose a new setting to train a unified cross-domain model through a single process, eliminating the need for domain-specific components and multi-stage training. We first introduce Pose-in-Context (PiC), which leverages in-context learning to create a pose-centric cross-domain model. While PiC generalizes across multiple pose-based tasks and datasets, it encounters difficulties with modality diversity, prompting strategy, and contextual dependency handling. We thus propose Human-in-Context (HiC), an extension of PiC that broadens generalization across modalities, tasks,…
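The in-context setup described in the abstract can be illustrated with a minimal sketch. It assumes that a task is specified only by a prompt pair (an example input and its target) placed before the query, which is how in-context learning is commonly framed; the names `InContextSample`, `build_model_input`, and `toy_motion`, the 17-joint skeleton, and the constant placeholder values are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import List

Frame = List[List[float]]   # joints x 3 coordinates
Motion = List[Frame]        # frames x joints x 3


@dataclass
class InContextSample:
    """Hypothetical container; not the paper's data structure."""
    prompt_input: Motion    # example input that demonstrates the task
    prompt_target: Motion   # desired output for that example
    query_input: Motion     # new input to be solved the same way


def build_model_input(sample: InContextSample) -> List[Motion]:
    """Order the prompt pair and the query into one list of motions."""
    parts = [sample.prompt_input, sample.prompt_target, sample.query_input]
    # All motions must cover the same number of frames to be combined.
    if len({len(motion) for motion in parts}) != 1:
        raise ValueError("prompt and query motions must have the same number of frames")
    return parts


def toy_motion(frames: int, joints: int, value: float) -> Motion:
    """Constant placeholder motion, used only for illustration."""
    return [[[value] * 3 for _ in range(joints)] for _ in range(frames)]


# Assumed sizes: 4 frames, 17 joints (illustrative only).
sample = InContextSample(
    prompt_input=toy_motion(4, 17, 0.0),
    prompt_target=toy_motion(4, 17, 1.0),
    query_input=toy_motion(4, 17, 0.5),
)
model_input = build_model_input(sample)
```

Under this framing, switching tasks means swapping the prompt pair while the model and the query format stay the same, which is one way to read the abstract's goal of avoiding domain-specific components.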
