Visual Attention in Imaginative Agents

Samrudhdhi B. Rangrej; James J. Clark

arXiv:2104.00177 · cs.CV · April 2, 2021 · 1 citation

TL;DR

This paper introduces a recurrent agent that explores a scene through a sequence of fixations. At each step it imagines plausible scenes consistent with what it has seen, plans the next fixation from the uncertainty in those imagined scenes, and learns its scene representations without supervision.

Contribution

The paper presents an agent architecture that combines a variational autoencoder with normalizing flows for unsupervised scene imagination and fixation planning.

Findings

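01 The agent becomes more certain about scene content over time, and the variety in its imagined scenes decreases.

02 Latent representations of the imagined scenes are useful for pixel-level and scene-level downstream tasks.

03 The agent is tested on various 2D and 3D datasets.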

Abstract

We present a recurrent agent that perceives its surroundings through a series of discrete fixations. At each timestep, the agent imagines a variety of plausible scenes consistent with the fixation history. The next fixation is planned using the uncertainty in the content of the imagined scenes. As time progresses, the agent becomes more certain about the content of its surroundings, and the variety in the imagined scenes decreases. The agent is built using a variational autoencoder and normalizing flows, and is trained in an unsupervised manner on a proxy task of scene reconstruction. The latent representations of the imagined scenes are found to be useful for performing pixel-level and scene-level tasks by higher-order modules. The agent is tested on various 2D and 3D datasets.
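The fixation-planning idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the paper generates imagined scenes with a variational autoencoder and normalizing flows, whereas here the samples are simply passed in as an array. The function name `next_fixation`, the use of per-pixel variance as the uncertainty measure, and the square glimpse window are all assumptions made for illustration.

```python
import numpy as np


def next_fixation(imagined_scenes, glimpse_size):
    """Return the top-left corner of the glimpse window with the most uncertainty.

    imagined_scenes: array of shape (num_samples, height, width), one
        hypothetical imagined scene per sample (assumed grayscale here).
    glimpse_size: side length of the square glimpse window, in pixels.
    """
    scenes = np.asarray(imagined_scenes, dtype=float)

    # Assumption: uncertainty is the per-pixel variance across imagined scenes,
    # so pixels where the samples disagree score high.
    uncertainty = scenes.var(axis=0)
    height, width = uncertainty.shape

    # Integral image (summed-area table) so every window sum costs O(1).
    integral = np.zeros((height + 1, width + 1))
    integral[1:, 1:] = uncertainty.cumsum(axis=0).cumsum(axis=1)

    g = glimpse_size
    # window_sums[i, j] is the total uncertainty inside rows i..i+g-1
    # and columns j..j+g-1.
    window_sums = (
        integral[g:, g:]
        - integral[:-g, g:]
        - integral[g:, :-g]
        + integral[:-g, :-g]
    )

    # Fixate where the imagined scenes disagree the most.
    row, col = np.unravel_index(np.argmax(window_sums), window_sums.shape)
    return int(row), int(col), uncertainty


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    samples = rng.normal(size=(16, 32, 32))  # stand-in for imagined scenes
    row, col, _ = next_fixation(samples, glimpse_size=8)
    print("next fixation (top-left corner):", row, col)
```

In a full agent, the glimpse taken at the chosen location would be added to the fixation history and a new set of scenes imagined, so the variance map would be expected to shrink as more of the scene is observed.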

Taxonomy

Topics: Advanced Vision and Imaging · Visual Attention and Saliency Detection · Advanced Image and Video Retrieval Techniques
