DreamHOI: Subject-Driven Generation of 3D Human-Object Interactions with Diffusion Priors

Thomas Hanwen Zhu; Ruining Li; Tomas Jakab

arXiv:2409.08278·cs.CV·September 13, 2024

Open Access

TL;DR

DreamHOI is a zero-shot method that synthesizes realistic 3D human-object interactions from text descriptions by combining diffusion models with a dual implicit-explicit mesh representation.

Contribution

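It introduces a dual implicit-explicit representation of a skinned human mesh and uses Score Distillation Sampling (SDS) gradients from text-to-image diffusion models to optimize the mesh's articulation, enabling realistic 3D HOI generation without large HOI datasets.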

Findings

01 Effective zero-shot synthesis of HOIs from text

02 Realistic 3D interactions generated with high fidelity

03 Outperforms existing methods in quality and diversity

Abstract

We present DreamHOI, a novel method for zero-shot synthesis of human-object interactions (HOIs), enabling a 3D human model to realistically interact with any given object based on a textual description. This task is complicated by the varying categories and geometries of real-world objects and the scarcity of datasets encompassing diverse HOIs. To circumvent the need for extensive data, we leverage text-to-image diffusion models trained on billions of image-caption pairs. We optimize the articulation of a skinned human mesh using Score Distillation Sampling (SDS) gradients obtained from these models, which predict image-space edits. However, directly backpropagating image-space gradients into complex articulation parameters is ineffective due to the local nature of such gradients. To overcome this, we introduce a dual implicit-explicit representation of a skinned mesh, combining…
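The abstract's notion of SDS gradients as "image-space edits" can be illustrated with a minimal sketch. This is not DreamHOI's pipeline: the noise predictor below is a hypothetical closed-form stand-in for a text-to-image diffusion model (it assumes the data distribution is a single target image), the "render" is just the image itself rather than a rendered mesh, and all function names are invented for illustration. It only shows the basic recipe of noising an image, asking a model to predict the noise, and using the difference between predicted and injected noise as a per-pixel gradient.

```python
import numpy as np


def make_toy_noise_predictor(target):
    """Hypothetical stand-in for a diffusion model (assumption, not a real API).

    It assumes the data distribution is a point mass at `target`, for which
    the ideal noise estimate has a closed form.
    """
    def predict_noise(noisy, alpha_bar_t):
        return (noisy - np.sqrt(alpha_bar_t) * target) / np.sqrt(1.0 - alpha_bar_t)
    return predict_noise


def sds_gradient(image, predict_noise, alpha_bar_t, rng, weight=1.0):
    """Image-space SDS-style gradient: weight * (predicted noise - injected noise)."""
    noise = rng.standard_normal(image.shape)
    # Forward diffusion: blend the image with Gaussian noise at level alpha_bar_t.
    noisy = np.sqrt(alpha_bar_t) * image + np.sqrt(1.0 - alpha_bar_t) * noise
    return weight * (predict_noise(noisy, alpha_bar_t) - noise)


def optimize_image(image, predict_noise, steps=200, lr=0.1, seed=0):
    """Gradient descent on pixels only; a real system would backpropagate
    this gradient through a differentiable renderer into scene parameters."""
    rng = np.random.default_rng(seed)
    image = image.astype(float).copy()
    for _ in range(steps):
        alpha_bar_t = rng.uniform(0.2, 0.8)  # random noise level per step
        image -= lr * sds_gradient(image, predict_noise, alpha_bar_t, rng)
    return image
```

Calling `optimize_image(np.zeros((2, 2)), make_toy_noise_predictor(target))` pulls the pixels toward `target`, because each gradient is a local, per-pixel correction. That locality is what the abstract identifies as the obstacle to driving articulation parameters directly with these gradients.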

Taxonomy

Topics: Human Pose and Action Recognition · Human Motion and Animation · 3D Shape Modeling and Analysis

Methods: Diffusion