CG-HOI: Contact-Guided 3D Human-Object Interaction Generation
Christian Diller, Angela Dai

TL;DR
CG-HOI generates realistic 3D human-object interaction sequences from text by modeling contact between human and object as explicit guidance, which improves the physical plausibility and coherence of the generated motion.
Contribution
It introduces the first approach to generate dynamic 3D HOIs from text using contact-guided joint diffusion modeling of human and object motions.
Findings
Produces realistic, physically plausible interaction sequences
Enables human motion generation conditioned on object trajectories without retraining
Applicable to static 3D scene scans
Abstract
We propose CG-HOI, the first method to address the task of generating dynamic 3D human-object interactions (HOIs) from text. We model the motion of both human and object in an interdependent fashion, as semantically rich human motion rarely happens in isolation without any interactions. Our key insight is that explicitly modeling contact between the human body surface and object geometry can be used as strong proxy guidance, both during training and inference. Using this guidance to bridge human and object motion enables generating more realistic and physically plausible interaction sequences, where the human body and corresponding object move in a coherent manner. Our method first learns to model human motion, object motion, and contact in a joint diffusion process, inter-correlated through cross-attention. We then leverage this learned contact for guidance during inference to…
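The idea of using predicted contact as guidance at inference time can be illustrated with a small sketch. The abstract is truncated before it describes the actual guidance procedure, so everything below is an assumption made for illustration: the loss (squared gap between each joint's nearest-object-point distance and a predicted contact distance), the choice to update only an object translation by plain gradient descent, and the names `contact_loss_and_grad` and `guide_translation` are all hypothetical and are not taken from the paper.

```python
import numpy as np

def contact_loss_and_grad(joints, obj_points, translation, target_dist):
    """Hypothetical contact loss and its gradient w.r.t. the object translation.

    joints:      (N, 3) body joint positions
    obj_points:  (M, 3) object points in the object's local frame
    translation: (3,)   current object translation
    target_dist: (N,)   predicted joint-to-object contact distances (assumed given)
    """
    moved = obj_points + translation                      # (M, 3)
    diff = joints[:, None, :] - moved[None, :, :]         # (N, M, 3)
    dist = np.linalg.norm(diff, axis=-1)                  # (N, M)
    nearest = dist.argmin(axis=1)                         # closest object point per joint
    idx = np.arange(len(joints))
    d_min = dist[idx, nearest]                            # (N,)
    residual = d_min - target_dist                        # gap to the predicted contact
    unit = diff[idx, nearest] / np.maximum(d_min, 1e-8)[:, None]
    loss = float(np.sum(residual ** 2))
    # d(d_min)/d(translation) = -unit, hence the leading minus sign.
    grad = -2.0 * np.sum(residual[:, None] * unit, axis=0)
    return loss, grad

def guide_translation(joints, obj_points, translation, target_dist,
                      step=0.1, iters=50):
    """Nudge the object translation toward the predicted contact distances."""
    t = np.asarray(translation, dtype=float).copy()
    for _ in range(iters):
        _, grad = contact_loss_and_grad(joints, obj_points, t, target_dist)
        t -= step * grad
    return t
```

For example, with a single joint at the origin, a single object point at (1, 0, 0), and a predicted contact distance of 0, `guide_translation` moves the object toward the joint until the two nearly coincide. In a diffusion setting, a correction of this kind would typically be applied to intermediate predictions during sampling rather than to a final pose, but that placement is likewise an assumption here.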
Taxonomy
Topics: Human Pose and Action Recognition · Human Motion and Animation · Multimodal Machine Learning Applications
Methods: Diffusion
