OOD-HOI: Text-Driven 3D Whole-Body Human-Object Interactions Generation Beyond Training Domains

Yixuan Zhang; Hui Yang; Chuanchen Luo; Junran Peng; Yuxi Wang; Zhaoxiang Zhang

arXiv:2411.18660·cs.CV·December 2, 2024

TL;DR

This paper introduces OOD-HOI, a novel text-driven framework that generates realistic 3D whole-body human-object interactions capable of generalizing to new objects and actions, addressing data scarcity and physical plausibility challenges.

Contribution

The paper proposes a dual-branch diffusion model with contact-guided refinement and dynamic adaptation for robust, out-of-domain 3D human-object interaction generation from text.

Findings

01 Outperforms existing methods in realism and physical plausibility

02 Effective in out-of-domain scenarios with new objects and actions

03 Demonstrates robustness and generalization in 3D interaction synthesis

Abstract

Generating realistic 3D human-object interactions (HOIs) from text descriptions is an active research topic with potential applications in virtual and augmented reality, robotics, and animation. However, creating high-quality 3D HOIs remains challenging due to the lack of large-scale interaction data and the difficulty of ensuring physical plausibility, especially in out-of-domain (OOD) scenarios. Current methods tend to focus either on the body or the hands, which limits their ability to produce cohesive and realistic interactions. In this paper, we propose OOD-HOI, a text-driven framework for generating whole-body human-object interactions that generalize well to new objects and actions. Our approach integrates a dual-branch reciprocal diffusion model to synthesize initial interaction poses, a contact-guided interaction refiner to improve physical accuracy based on predicted contact…
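To make the idea of contact-guided refinement concrete, the following toy sketch pulls selected joints toward predicted contact points by gradient descent on a squared-distance loss. It is only an illustration under stated assumptions, not the paper's refiner: the names `contact_loss` and `refine_toward_contacts`, the per-joint contact flags, the step size, and the loss itself are all hypothetical choices made for this example.

```python
from typing import List, Sequence

Point = List[float]


def contact_loss(joints: Sequence[Point], targets: Sequence[Point],
                 in_contact: Sequence[bool]) -> float:
    """Sum of squared distances between flagged joints and their targets."""
    total = 0.0
    for joint, target, flagged in zip(joints, targets, in_contact):
        if flagged:
            total += sum((a - b) ** 2 for a, b in zip(joint, target))
    return total


def refine_toward_contacts(joints: Sequence[Point], targets: Sequence[Point],
                           in_contact: Sequence[bool],
                           step: float = 0.1, iters: int = 50) -> List[Point]:
    """Hypothetical refiner: plain gradient descent on contact_loss.

    Joints not flagged as in contact are left unchanged.
    """
    refined = [list(j) for j in joints]  # copy so the input is not mutated
    for _ in range(iters):
        for joint, target, flagged in zip(refined, targets, in_contact):
            if not flagged:
                continue
            for k in range(len(joint)):
                # d/dx (x - t)^2 = 2 (x - t)
                joint[k] -= step * 2.0 * (joint[k] - target[k])
    return refined


if __name__ == "__main__":
    # Made-up values: one joint predicted to touch the object, one free.
    joints = [[0.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
    targets = [[1.0, 0.0, 0.0], [5.0, 5.0, 5.0]]
    flags = [True, False]
    refined = refine_toward_contacts(joints, targets, flags)
    print(contact_loss(joints, targets, flags),
          contact_loss(refined, targets, flags))
```

A real refiner would operate on a full body and hand parameterization and would likely combine contact terms with penetration and smoothness penalties; this sketch keeps only the contact-attraction step.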

Taxonomy

Topics: Human Pose and Action Recognition · Human Motion and Animation · Hand Gesture Recognition Systems

Methods: Focus · Diffusion