Ditto in the House: Building Articulation Models of Indoor Scenes through Interactive Perception

Cheng-Chun Hsu; Zhenyu Jiang; Yuke Zhu

arXiv:2302.01295 · cs.RO · February 3, 2023

Open Access

TL;DR

This paper presents Ditto in the House, an interactive perception system enabling a robot to discover, manipulate, and infer articulation models of indoor objects, facilitating room-scale scene understanding for robotic manipulation.

Contribution

The paper introduces an interactive perception approach that combines affordance prediction with articulation inference to model room-scale indoor scenes.

Findings

01. Effective in simulation and real-world scenes

02. Improves articulation reasoning through interaction

03. Enables robot exploration of complex indoor environments

Abstract

Virtualizing the physical world into virtual models has been a critical technique for robot navigation and planning in the real world. To foster manipulation with articulated objects in everyday life, this work explores building articulation models of indoor scenes through a robot's purposeful interactions in these scenes. Prior work on articulation reasoning primarily focuses on siloed objects of limited categories. To extend to room-scale environments, the robot has to efficiently and effectively explore a large-scale 3D space, locate articulated objects, and infer their articulations. We introduce an interactive perception approach to this task. Our approach, named Ditto in the House, discovers possible articulated objects through affordance prediction, interacts with these objects to produce articulated motions, and infers the articulation properties from the visual observations…
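The three stages named in the abstract (affordance prediction, interaction, articulation inference) can be read as a simple control loop. The sketch below illustrates only that control flow and is not the authors' implementation: the names `Candidate`, `ArticulationModel`, `build_scene_model`, and the `min_score` threshold are hypothetical, and the `interact` and `infer` callables stand in for the robot controller and the learned inference model, which the abstract does not specify.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]


@dataclass
class Candidate:
    """Hypothetical interaction hotspot proposed by affordance prediction."""
    position: Point
    score: float  # assumed: predicted chance that interacting here moves a part


@dataclass
class ArticulationModel:
    """Hypothetical per-object result: joint type plus axis parameters."""
    joint_type: str  # e.g. "revolute" or "prismatic"
    axis: Point
    pivot: Point


def build_scene_model(
    candidates: Sequence[Candidate],
    interact: Callable[[Candidate], Optional[Tuple[Any, Any]]],
    infer: Callable[[Any, Any], ArticulationModel],
    min_score: float = 0.5,  # assumed cutoff, not taken from the paper
) -> List[ArticulationModel]:
    """Visit candidates in descending score order and collect joint estimates."""
    models: List[ArticulationModel] = []
    for cand in sorted(candidates, key=lambda c: c.score, reverse=True):
        if cand.score < min_score:
            break  # remaining candidates all score lower
        observations = interact(cand)  # (before, after) observations, or None
        if observations is None:
            continue  # the interaction produced no articulated motion
        before, after = observations
        models.append(infer(before, after))
    return models
```

Passing the interaction and inference steps as callables keeps the loop independent of any particular robot or network, so each stage can be stubbed out when testing the exploration logic alone.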


Taxonomy

Topics: Human Pose and Action Recognition · Multimodal Machine Learning Applications · Human Motion and Animation