DAViD: Modeling Dynamic Affordance of 3D Objects Using Pre-trained Video Diffusion Models
Hyeonwoo Kim, Sangwon Baik, Hanbyul Joo

TL;DR
This paper introduces DAViD, a framework that models dynamic human-object interactions in 3D. It generates 2D interaction videos of a 3D target object with a pre-trained video diffusion model, lifts those videos into synthetic 4D samples, and learns human-object motion patterns from the samples.
Contribution
The paper presents a new pipeline for learning 4D human-object interaction models using synthetic data and introduces a LoRA-enhanced diffusion model for capturing dynamic affordance in 3D objects.
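As background for the LoRA component mentioned above, here is a minimal sketch of the general low-rank adaptation idea: a frozen pre-trained weight matrix W is augmented with a trainable update (alpha / r) * B A of rank at most r. This is not the paper's implementation. The class name LoRALinear, the default rank and alpha, and the use of NumPy in place of a real diffusion backbone are all assumptions made for illustration.

```python
import numpy as np


class LoRALinear:
    """Illustrative only: a frozen linear layer plus a low-rank LoRA update.

    Hypothetical helper, not taken from the DAViD code base.
    """

    def __init__(self, weight, rank=4, alpha=8.0, seed=0):
        rng = np.random.default_rng(seed)
        out_dim, in_dim = weight.shape
        self.weight = weight  # frozen pre-trained weight, shape (out_dim, in_dim)
        # Trainable factors. B starts at zero so the adapted layer
        # initially reproduces the pre-trained layer exactly.
        self.A = rng.normal(0.0, 0.01, size=(rank, in_dim))
        self.B = np.zeros((out_dim, rank))
        self.scale = alpha / rank

    def delta(self):
        # Low-rank update added to the frozen weight; rank is at most `rank`.
        return self.scale * (self.B @ self.A)

    def __call__(self, x):
        # x has shape (batch, in_dim); output has shape (batch, out_dim).
        return x @ (self.weight + self.delta()).T
```

Only A and B would be updated during fine-tuning, which is why an adapter of this kind can add a new concept while leaving the pre-trained weights untouched.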
Findings
DAViD outperforms baselines in HOI motion synthesis.
The pipeline effectively integrates new HOI concepts with pre-trained motions.
Synthetic 4D samples enable learning from limited data.
Abstract
Modeling how humans interact with objects is crucial for AI to effectively assist or mimic human behaviors. Existing studies for learning such ability primarily focus on static human-object interaction (HOI) patterns, such as contact and spatial relationships, while dynamic HOI patterns, capturing the movement of humans and objects over time, remain relatively underexplored. In this paper, we present a novel framework for learning Dynamic Affordance across various target object categories. To address the scarcity of 4D HOI datasets, our method learns the 3D dynamic affordance from synthetically generated 4D HOI samples. Specifically, we propose a pipeline that first generates 2D HOI videos from a given 3D target object using a pre-trained video diffusion model, then lifts them into 3D to generate 4D HOI samples. Leveraging these synthesized 4D HOI samples, we train DAViD, our generative…
Taxonomy
Topics: 3D Shape Modeling and Analysis
Methods: Diffusion · Focus
