Open-Vocabulary Functional 3D Human-Scene Interaction Generation
Jie Liu, Yu Sun, Alpar Cseke, Yao Feng, Nicolas Heron, Michael J. Black, Yan Zhang

TL;DR
This paper introduces FunHSI, a framework that generates functionally correct 3D human-scene interactions from open-vocabulary prompts by reasoning about object functionality and contact, improving plausibility and diversity.
Contribution
FunHSI is a training-free, functionality-driven method that explicitly reasons about object functionality and human-scene contact to produce functionally correct 3D human-scene interactions from open-vocabulary task prompts.
Findings
Generates more plausible 3D human-scene interactions than existing methods.
Supports fine-grained functional interactions, such as adjusting room temperature.
Works across diverse indoor and outdoor scenes.
Abstract
Generating 3D humans that functionally interact with 3D scenes remains an open problem with applications in embodied AI, robotics, and interactive content creation. The key challenge involves reasoning about both the semantics of functional elements in 3D scenes and the 3D human poses required to achieve functionality-aware interaction. Unfortunately, existing methods typically lack explicit reasoning over object functionality and the corresponding human-scene contact, resulting in implausible or functionally incorrect interactions. In this work, we propose FunHSI, a training-free, functionality-driven framework that enables functionally correct human-scene interactions from open-vocabulary task prompts. Given a task prompt, FunHSI performs functionality-aware contact reasoning to identify functional scene elements, reconstruct their 3D geometry, and model high-level interactions via a…
Taxonomy
Topics: Multimodal Machine Learning Applications · 3D Shape Modeling and Analysis · Human Motion and Animation
