Hierarchical Generation of Human-Object Interactions with Diffusion Probabilistic Models
Huaijin Pi, Sida Peng, Minghui Yang, Xiaowei Zhou, Hujun Bao

TL;DR
This paper introduces a hierarchical diffusion-based framework for generating diverse and long-range 3D human-object interaction motions by synthesizing milestones and short motion sequences, outperforming previous methods.
Contribution
The paper proposes a hierarchical diffusion framework that first generates a set of milestones and then synthesizes short motion sequences between them, making long-range, diverse human-object interaction motion tractable.
Findings
Outperforms previous methods in quality and diversity
Effective long-range motion synthesis through hierarchical approach
Validated on NSM, COUCH, and SAMP datasets
Abstract
This paper presents a novel approach to generating the 3D motion of a human interacting with a target object, with a focus on solving the challenge of synthesizing long-range and diverse motions, which could not be fulfilled by existing auto-regressive models or path planning-based methods. We propose a hierarchical generation framework to solve this challenge. Specifically, our framework first generates a set of milestones and then synthesizes the motion along them. Therefore, the long-range motion generation could be reduced to synthesizing several short motion sequences guided by milestones. The experiments on the NSM, COUCH, and SAMP datasets show that our approach outperforms previous methods by a large margin in both quality and diversity. The source code is available on our project page https://zju3dv.github.io/hghoi.
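The two-stage idea in the abstract (generate milestones first, then synthesize short motions along them) can be illustrated with a toy sketch. This is not the authors' implementation: the function names are hypothetical, the milestone sampler is a jittered straight line, and the infilling step is plain linear interpolation standing in for the learned diffusion models. It only shows how a long sequence decomposes into short, milestone-bounded segments.

```python
import random


def sample_milestones(start, goal, num_milestones, rng):
    """Hypothetical stand-in for a milestone generator.

    Returns start, `num_milestones` jittered intermediate points, and goal.
    A learned model would sample these; here we perturb a straight line.
    """
    milestones = [tuple(start)]
    for i in range(1, num_milestones + 1):
        t = i / (num_milestones + 1)
        point = tuple(
            s + t * (g - s) + rng.uniform(-0.1, 0.1)
            for s, g in zip(start, goal)
        )
        milestones.append(point)
    milestones.append(tuple(goal))
    return milestones


def infill_segment(a, b, frames_per_segment):
    """Hypothetical stand-in for a short-sequence motion generator.

    Linearly interpolates from milestone `a` towards `b`, excluding `b`
    itself so that consecutive segments do not duplicate a frame.
    """
    return [
        tuple(x + (k / frames_per_segment) * (y - x) for x, y in zip(a, b))
        for k in range(frames_per_segment)
    ]


def generate_motion(start, goal, num_milestones=3, frames_per_segment=10, seed=0):
    """Stage 1: sample milestones. Stage 2: fill in each short segment."""
    rng = random.Random(seed)
    milestones = sample_milestones(start, goal, num_milestones, rng)
    motion = []
    for a, b in zip(milestones, milestones[1:]):
        motion.extend(infill_segment(a, b, frames_per_segment))
    motion.append(milestones[-1])  # close the sequence at the goal
    return milestones, motion


if __name__ == "__main__":
    milestones, motion = generate_motion((0.0, 0.0), (4.0, 2.0))
    print(len(milestones), "milestones,", len(motion), "frames")
```

Because each segment depends only on its two bounding milestones, the segments in this sketch could be generated independently of one another; the long-range problem reduces to several short ones, which is the reduction the abstract describes.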
Taxonomy
Topics: Human Pose and Action Recognition · Human Motion and Animation · Video Surveillance and Tracking Methods
