Attend Before you Act: Leveraging human visual attention for continual learning
Khimya Khetarpal, Doina Precup

TL;DR
This paper explores using saliency maps, as a proxy for where humans look, to improve continual learning in reinforcement learning agents navigating 3D environments, and reports improved transfer learning.
Contribution
It introduces a method to incorporate saliency-based foveated images into reinforcement learning, leveraging human visual attention cues to improve transfer learning in noisy environments.
Findings
Saliency-guided training improves transfer learning performance.
Foveated images help the agent focus on relevant visual information.
The approach enhances robustness to environmental noise.
Abstract
When humans perform a task, such as playing a game, they selectively pay attention to certain parts of the visual input, gathering relevant information and sequentially combining it to build a representation from the sensory data. In this work, we explore leveraging where humans look in an image as an implicit indication of what is salient for decision making. We build on top of the UNREAL architecture in DeepMind Lab's 3D navigation maze environment. We train the agent both with original images and foveated images, which were generated by overlaying the original images with saliency maps generated using a real-time spectral residual technique. We investigate the effectiveness of this approach in transfer learning by measuring performance in the context of noise in the environment.
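The foveation step described in the abstract can be illustrated with a minimal sketch. It assumes the standard spectral residual formulation (log-amplitude spectrum minus its local average, recombined with the original phase) and a simple multiplicative overlay. The function names, the 3x3 mean filter used for smoothing, and the `floor` parameter are hypothetical choices for illustration; the abstract does not specify how the authors performed the overlay.

```python
import numpy as np

def box_filter3(x):
    """3x3 mean filter with edge padding, using NumPy only."""
    p = np.pad(x, 1, mode="edge")
    h, w = x.shape
    return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0

def spectral_residual_saliency(gray):
    """Return a saliency map scaled to [0, 1] for a 2D grayscale float array."""
    spectrum = np.fft.fft2(gray)
    log_amp = np.log(np.abs(spectrum) + 1e-8)   # small epsilon avoids log(0)
    phase = np.angle(spectrum)
    # Spectral residual: the part of the log spectrum not explained by its local mean.
    residual = log_amp - box_filter3(log_amp)
    sal = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    # Assumption: a mean filter stands in for the usual Gaussian smoothing.
    sal = box_filter3(sal)
    sal -= sal.min()
    span = sal.max()
    return sal / span if span > 0 else sal

def foveate(image, saliency, floor=0.2):
    """Dim low-saliency pixels of an H x W x C image.

    `floor` (hypothetical) is the minimum brightness kept in non-salient regions.
    """
    weight = floor + (1.0 - floor) * saliency
    return image * weight[..., None]

# Usage on a random frame standing in for an environment observation.
frame = np.random.default_rng(0).random((32, 32, 3))
saliency = spectral_residual_saliency(frame.mean(axis=2))
foveated = foveate(frame, saliency)
```

In this sketch the agent would receive `foveated` in place of `frame`; a real pipeline would likely operate on resized frames and tune the smoothing for the observation resolution.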
Taxonomy
Topics: Visual Attention and Saliency Detection · Domain Adaptation and Few-Shot Learning · Multimodal Machine Learning Applications
