TL;DR
This paper presents HA-HOI, a novel framework that reconstructs physically plausible 4D human-object interactions from monocular videos, enabling stable, contact-aware, and simulation-ready animations.
Contribution
It introduces a human-first, object-follow approach that aligns reconstructed trajectories with physics-based simulation, improving realism and stability over prior methods.
Findings
Enhances human-object alignment and contact consistency.
Improves temporal stability of reconstructed interactions.
Enables simulation-ready 4D HOI animations.
Abstract
Recovering 4D human-object interaction (HOI) from monocular video is a key step toward scalable 3D content creation, embodied AI, and simulation-based learning. Recent methods can reconstruct temporally coherent human and object trajectories, but these trajectories often remain visual artifacts while failing to preserve stable contact, functional manipulation, or physical plausibility when used as reference motions for humanoid-object simulation. This reveals a fundamental interaction gap: HOI reconstruction should not stop at tracking a human and an object, but should recover the relation that makes their motion a coherent interaction. We introduce HA-HOI, a framework for reconstructing physically plausible 4D HOI animation from in-the-wild monocular videos. Instead of treating the human and object as independent entities in an ambiguous monocular 3D space, we propose a…
