Visual Navigation with Spatial Attention

Bar Mayo; Tamir Hazan; Ayellet Tal

arXiv:2104.09807·cs.CV·April 21, 2021


TL;DR

This paper introduces a novel attention-based reinforcement learning approach for object goal visual navigation, combining semantic and spatial information to improve navigation accuracy and achieve state-of-the-art results.

Contribution

The main contribution is a new attention probability model that encodes both semantic and spatial information, enhancing the agent's navigation policy.

Findings

01. The attention model improves navigation performance.

02. Achieves state-of-the-art results on standard datasets.

03. Enhances the agent's ability to locate objects efficiently.

Abstract

This work focuses on object goal visual navigation, aiming at finding the location of an object from a given class, where in each step the agent is provided with an egocentric RGB image of the scene. We propose to learn the agent's policy using a reinforcement learning algorithm. Our key contribution is a novel attention probability model for visual navigation tasks. This attention encodes semantic information about observed objects, as well as spatial information about their place. This combination of the "what" and the "where" allows the agent to navigate toward the sought-after object effectively. The attention model is shown to improve the agent's policy and to achieve state-of-the-art results on commonly-used datasets.
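
To make the "what" and "where" combination concrete, here is a minimal sketch of an attention probability over a grid of image cells. It is an illustration only and not the authors' implementation (see the repository listed under Code & Models for that). Every name in it is hypothetical: `cell_features`, `target_embedding`, and `position_scores` are assumptions, and so is the scoring rule, which simply adds a semantic similarity to a per-location score before a softmax.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a flat list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def spatial_attention(cell_features, target_embedding, position_scores):
    """Return an H x W grid of attention probabilities that sums to 1.

    cell_features[i][j]  : semantic feature vector of a grid cell ("what")
    target_embedding     : embedding of the sought-after object class
    position_scores[i][j]: hypothetical per-location score ("where")
    The additive scoring rule is an assumption made for this sketch.
    """
    h, w = len(cell_features), len(cell_features[0])
    scores = [
        dot(cell_features[i][j], target_embedding) + position_scores[i][j]
        for i in range(h)
        for j in range(w)
    ]
    probs = softmax(scores)
    # Reshape the flat probability list back into H rows of W cells.
    return [probs[i * w:(i + 1) * w] for i in range(h)]


def attend(cell_features, attention):
    """Attention-weighted sum of cell features: a fixed-size input for a policy."""
    dim = len(cell_features[0][0])
    pooled = [0.0] * dim
    for row_feats, row_att in zip(cell_features, attention):
        for feat, a in zip(row_feats, row_att):
            for k in range(dim):
                pooled[k] += a * feat[k]
    return pooled


if __name__ == "__main__":
    # Toy 2 x 2 grid with 2-dimensional semantic features.
    features = [[[1.0, 0.0], [0.0, 1.0]],
                [[0.0, 0.0], [0.5, 0.5]]]
    target = [1.0, 0.0]
    no_position_bias = [[0.0, 0.0], [0.0, 0.0]]
    attention = spatial_attention(features, target, no_position_bias)
    print(attention)
    print(attend(features, attention))
```

In this toy setup the cell whose features best match the target receives the most attention, and raising a cell's position score shifts probability mass toward that location. In a reinforcement learning agent, the pooled vector would be one of the inputs to the policy network.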


Code & Models

Repositories

barmayo/spatial_attention (PyTorch)
