Scene Graph for Embodied Exploration in Cluttered Scenario

Yuhong Deng; Qie Sima; Di Guo; Huaping Liu; Yi Wang; Fuchun Sun

arXiv:2207.07870·cs.RO·October 17, 2023·1 cites

Open Access

TL;DR

This paper introduces a scene graph-based framework for embodied exploration in cluttered environments, enabling robots to understand and manipulate objects through active exploration and semantic reasoning, validated on manipulation question answering tasks.

Contribution

It presents a novel scene graph approach combined with imitation learning and VQA models for semantic understanding in cluttered scenarios, addressing a gap in robotic exploration and manipulation.

Findings

01. Effective in MQA tasks with cluttered environments

02. Demonstrates improved semantic understanding during exploration

03. Validates the approach's applicability to real-world robotic tasks

Abstract

The ability to handle objects in cluttered environments has long been anticipated by the robotics community. However, most works focus merely on manipulation rather than on revealing the hidden semantic information in cluttered objects. In this work, we introduce the scene graph for embodied exploration in cluttered scenarios to solve this problem. To validate our method in cluttered scenarios, we adopt the Manipulation Question Answering (MQA) task as our test benchmark, which requires an embodied robot to have both active exploration ability and semantic understanding of vision and language. As a general solution framework for the task, we propose an imitation learning method to generate manipulations for exploration. Meanwhile, a VQA model based on a dynamic scene graph is adopted to comprehend a series of RGB frames from the wrist camera of the manipulator along with every step of manipulation is…
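As a rough illustration of the dynamic scene graph idea mentioned in the abstract, the sketch below keeps a graph of objects and relations, merges in new observations after each manipulation step, and answers a simple counting question. It is not the authors' implementation: the `SceneGraph` class, its methods, the object ids, and the relation names are all hypothetical, and the detections are hard-coded rather than produced by a vision model.

```python
from dataclasses import dataclass, field


@dataclass
class SceneGraph:
    """Toy dynamic scene graph (hypothetical): nodes are objects, edges are relations."""
    nodes: dict = field(default_factory=dict)  # object id -> category label
    edges: set = field(default_factory=set)    # (subject id, relation, object id)

    def update(self, detections, relations):
        """Merge one frame's (hard-coded, assumed) detections into the graph."""
        self.nodes.update(detections)
        self.edges.update(relations)

    def count(self, category):
        """Answer a counting question such as 'how many cups are there?'."""
        return sum(1 for label in self.nodes.values() if label == category)

    def exists(self, category):
        """Answer an existence question such as 'is there a key?'."""
        return self.count(category) > 0


# Frame 1: the box occludes part of the scene, so only one cup is visible.
graph = SceneGraph()
graph.update({"obj0": "box", "obj1": "cup"}, {("obj0", "next_to", "obj1")})
before = graph.count("cup")

# Frame 2: after an assumed push action, a second cup is revealed.
graph.update({"obj2": "cup"}, {("obj2", "behind", "obj0")})
after = graph.count("cup")
```

In this toy setting the answer to "how many cups?" changes once the manipulation exposes the hidden object, which is the kind of question where exploration and semantic understanding have to work together.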

Taxonomy

Topics: Multimodal Machine Learning Applications · Human Pose and Action Recognition · Domain Adaptation and Few-Shot Learning