A Simple Approach for Visual Rearrangement: 3D Mapping and Semantic Search

Brandon Trabucco; Gunnar Sigurdsson; Robinson Piramuthu; Gaurav S. Sukhatme; Ruslan Salakhutdinov

arXiv:2206.13396·cs.CV·August 11, 2022·5 cites

TL;DR

This paper introduces a straightforward method for visual room rearrangement using semantic mapping and search, significantly improving success rates and efficiency over previous reinforcement learning approaches.

Contribution

The paper presents a simple, effective approach to visual rearrangement that combines an off-the-shelf semantic segmentation model, a voxel-based semantic map, and a semantic search policy, outperforming prior end-to-end RL methods.

Findings

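01. Raised the correct-rearrangement rate on the AI2-THOR Rearrangement Challenge from 0.53% to 16.56%.

02. Used only 2.7% as many environment samples as prior end-to-end reinforcement learning methods.

03. Showed that a voxel-based semantic map is effective for finding the objects that need to be rearranged.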

Abstract

Physically rearranging objects is an important capability for embodied agents. Visual room rearrangement evaluates an agent's ability to rearrange objects in a room to a desired goal based solely on visual input. We propose a simple yet effective method for this problem: (1) search for and map which objects need to be rearranged, and (2) rearrange each object until the task is complete. Our approach consists of an off-the-shelf semantic segmentation model, voxel-based semantic map, and semantic search policy to efficiently find objects that need to be rearranged. On the AI2-THOR Rearrangement Challenge, our method improves on current state-of-the-art end-to-end reinforcement learning-based methods that learn visual rearrangement policies from 0.53% correct rearrangement to 16.56%, using only 2.7% as many samples from the environment.
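The two-step recipe in the abstract (map the objects, then find which ones need to move) can be illustrated with a small sketch. This is a minimal sketch under stated assumptions, not the authors' implementation: it assumes semantically labeled 3D points are already available (for example from a segmentation model plus depth), votes them into a voxel grid, and flags an object class as needing rearrangement when its occupied voxels differ between a goal map and a current map. The names `VOXEL_SIZE`, `build_semantic_map`, and `objects_to_rearrange`, and the exact-match comparison rule, are all hypothetical choices made for illustration.

```python
import math
from collections import Counter, defaultdict

# Assumed voxel edge length in metres (hypothetical; not taken from the paper).
VOXEL_SIZE = 0.05


def to_voxel(point, voxel_size=VOXEL_SIZE):
    """Map a 3D point (x, y, z) to the integer index of the voxel containing it."""
    return tuple(math.floor(c / voxel_size) for c in point)


def build_semantic_map(labeled_points, voxel_size=VOXEL_SIZE):
    """Build {voxel index: label} from (point, label) pairs by majority vote per voxel."""
    votes = defaultdict(Counter)
    for point, label in labeled_points:
        votes[to_voxel(point, voxel_size)][label] += 1
    return {voxel: counts.most_common(1)[0][0] for voxel, counts in votes.items()}


def voxels_by_label(semantic_map):
    """Group voxel indices by their semantic label."""
    grouped = defaultdict(set)
    for voxel, label in semantic_map.items():
        grouped[label].add(voxel)
    return grouped


def objects_to_rearrange(goal_map, current_map):
    """Return labels seen in both maps whose occupied voxels differ (illustrative rule)."""
    goal = voxels_by_label(goal_map)
    current = voxels_by_label(current_map)
    shared = goal.keys() & current.keys()
    return sorted(label for label in shared if goal[label] != current[label])


# Toy data: the mug has moved, the book has not.
goal_points = [((0.10, 0.00, 0.52), "mug"), ((1.00, 0.00, 0.50), "book")]
current_points = [((0.80, 0.00, 0.52), "mug"), ((1.00, 0.00, 0.50), "book")]

goal_map = build_semantic_map(goal_points)
current_map = build_semantic_map(current_points)
print(objects_to_rearrange(goal_map, current_map))  # ['mug']
```

Exact voxel-set equality is deliberately crude; a practical system would more likely compare object positions with a distance tolerance, and would need a separate policy for deciding where to look next, which this sketch does not cover.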

Taxonomy

Topics: Visual Attention and Saliency Detection · Cell Image Analysis Techniques · Advanced Neural Network Applications