TriDi: Trilateral Diffusion of 3D Humans, Objects, and Interactions

Ilya A. Petrov; Riccardo Marin; Julian Chibane; Gerard Pons-Moll

arXiv:2412.06334·cs.CV·July 29, 2025

TL;DR

TriDi introduces a unified three-way diffusion model for 3D human-object interaction, capable of generating human, object, and interaction data simultaneously, surpassing prior one-way models in diversity and quality.

Contribution

It is the first model to unify bidirectional 3D human-object interaction modeling using a single diffusion process and transformer architecture.

Findings

01. Outperforms specialized baselines on the GRAB and BEHAVE datasets.

02. Generates diverse and high-quality 3D human-object interaction samples.

03. Demonstrates applicability to scene population and generalization to unseen objects.

Abstract

Modeling 3D human-object interaction (HOI) is a problem of great interest for computer vision and a key enabler for virtual and mixed-reality applications. Existing methods work in one direction only: some recover plausible human interactions conditioned on a 3D object; others recover the object pose conditioned on a human pose. Instead, we provide the first unified model, TriDi, which works in any direction. Concretely, we generate Human, Object, and Interaction modalities simultaneously with a new three-way diffusion process, which allows us to model seven distributions with one network. We implement TriDi as a transformer attending to the various modalities' tokens, thereby discovering conditional relations between them. The user can control the interaction either as a text description of HOI or a contact map. We embed these two representations into a shared latent space, combining the…
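The abstract does not list the seven distributions. One plausible reading, offered here as an assumption rather than the authors' stated definition, is that each of the three modalities is either generated or used as a condition, and at least one must be generated, giving 2^3 - 1 = 7 cases. The sketch below enumerates them; the single-letter labels `H`, `O`, `I` and the function name `enumerate_distributions` are illustrative only.

```python
from itertools import combinations

# Illustrative labels (assumption): Human, Object, Interaction.
MODALITIES = ("H", "O", "I")


def enumerate_distributions(modalities=MODALITIES):
    """List every split of the modalities into generated and conditioned sets.

    At least one modality must be generated, so three modalities
    yield 2**3 - 1 = 7 distributions.
    """
    distributions = []
    # Start with the joint (all generated), then shrink the generated set.
    for k in range(len(modalities), 0, -1):
        for generated in combinations(modalities, k):
            given = tuple(m for m in modalities if m not in generated)
            lhs = ", ".join(generated)
            if given:
                distributions.append(f"p({lhs} | {', '.join(given)})")
            else:
                distributions.append(f"p({lhs})")
    return distributions


distributions = enumerate_distributions()
for d in distributions:
    print(d)
```

Under this reading, the one-way methods mentioned in the abstract each correspond to a single conditional in the list, while the joint `p(H, O, I)` is unconditional generation of all three modalities.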

Taxonomy

Topics: 3D Shape Modeling and Analysis

Methods: Diffusion