Attend Before you Act: Leveraging human visual attention for continual learning

Khimya Khetarpal; Doina Precup

arXiv:1807.09664 · cs.AI · July 26, 2018 · 1 citation

TL;DR

This paper explores using human-like visual attention, via saliency maps, to improve continual learning in reinforcement learning agents navigating 3D environments, demonstrating enhanced transfer learning capabilities.

Contribution

It introduces a method to incorporate saliency-based foveated images into reinforcement learning, leveraging human visual attention cues to improve transfer learning in noisy environments.

Findings

1. Saliency-guided training improves transfer learning performance.
2. Foveated images help the agent focus on relevant visual information.
3. The approach enhances robustness to environmental noise.

Abstract

When humans perform a task, such as playing a game, they selectively pay attention to certain parts of the visual input, gathering relevant information and sequentially combining it to build a representation from the sensory data. In this work, we explore leveraging where humans look in an image as an implicit indication of what is salient for decision making. We build on top of the UNREAL architecture in DeepMind Lab's 3D navigation maze environment. We train the agent both with original images and foveated images, which were generated by overlaying the original images with saliency maps generated using a real-time spectral residual technique. We investigate the effectiveness of this approach in transfer learning by measuring performance in the context of noise in the environment.
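The foveation step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it follows the general spectral residual recipe (log-amplitude spectrum minus its local average, recombined with the original phase), uses a box filter as a stand-in for the usual Gaussian smoothing, and assumes that "overlaying" means scaling each pixel by its saliency weight. The function names, the `floor` parameter, and the 64x64 frame size are all hypothetical.

```python
import numpy as np


def box_filter(x, k=3):
    """Mean filter with an odd k x k window and edge padding."""
    pad = k // 2
    padded = np.pad(x, pad, mode="edge")
    out = np.zeros(x.shape, dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + x.shape[0], dx:dx + x.shape[1]]
    return out / (k * k)


def spectral_residual_saliency(gray, k=3):
    """Return a saliency map in [0, 1] for a 2-D grayscale array."""
    spectrum = np.fft.fft2(gray)
    log_amp = np.log(np.abs(spectrum) + 1e-8)  # epsilon avoids log(0)
    phase = np.angle(spectrum)
    # Spectral residual: log amplitude minus its local average.
    residual = log_amp - box_filter(log_amp, k)
    # Back to the image domain, keeping the original phase.
    sal = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    sal = box_filter(sal, 5)  # assumption: box blur in place of a Gaussian
    sal -= sal.min()
    peak = sal.max()
    return sal / peak if peak > 0 else sal


def foveate(rgb, saliency, floor=0.2):
    """Assumed overlay: scale pixels by a weight in [floor, 1]."""
    weights = floor + (1.0 - floor) * saliency
    return rgb * weights[..., None]


# Usage on a synthetic frame (hypothetical size, random content).
rng = np.random.default_rng(0)
frame = rng.random((64, 64, 3))
saliency = spectral_residual_saliency(frame.mean(axis=2))
foveated = foveate(frame, saliency)
```

Keeping a nonzero `floor` is a design choice made for this sketch so that low-saliency regions are dimmed rather than erased; the abstract does not say how the overlay was weighted.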

Code & Models

Repositories

kkhetarpal/unrealwithattention · tf · Official

Taxonomy

Topics: Visual Attention and Saliency Detection · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications