InterFusion: Text-Driven Generation of 3D Human-Object Interaction
Sisi Dai, Wenhao Li, Haowen Sun, Haibin Huang, Chongyang Ma, Hui Huang, Kai Xu, Ruizhen Hu

TL;DR
InterFusion is a novel two-stage framework that enables zero-shot text-to-3D human-object interaction generation by leveraging human pose priors and a local-global optimization process, significantly improving over previous methods.
Contribution
We introduce InterFusion, a two-stage approach that addresses data scarcity and complex spatial relationships in 3D HOI generation from text, advancing the state of the art.
Findings
Outperforms existing methods in 3D HOI generation
Produces realistic and high-quality 3D scenes
Effectively handles complex spatial relationships
Abstract
In this study, we tackle the complex task of generating 3D human-object interactions (HOI) from textual descriptions in a zero-shot text-to-3D manner. We identify and address two key challenges: the unsatisfactory outcomes of direct text-to-3D methods in HOI, largely due to the lack of paired text-interaction data, and the inherent difficulties in simultaneously generating multiple concepts with complex spatial relationships. To effectively address these issues, we present InterFusion, a two-stage framework specifically designed for HOI generation. InterFusion involves human pose estimations derived from text as geometric priors, which simplifies the text-to-3D conversion process and introduces additional constraints for accurate object generation. At the first stage, InterFusion extracts 3D human poses from a synthesized image dataset depicting a wide range of interactions,…
Taxonomy
Topics: Human Motion and Animation · Speech and dialogue systems · Natural Language Processing Techniques
