Unsupervised Foveal Vision Neural Networks with Top-Down Attention

Ryan Burt; Nina N. Thigpen; Andreas Keil; Jose C. Principe

arXiv:2010.09103·cs.LG·October 20, 2020

Open Access

TL;DR

This paper introduces an unsupervised foveal vision neural network that combines bottom-up saliency and top-down attention, improving object recognition speed and accuracy, and outperforming humans in disambiguating objects from backgrounds.

Contribution

It presents a novel unsupervised architecture integrating Gamma saliency and top-down attention, enhancing scene understanding and CNN performance without supervised training.

Findings

01. Gamma saliency is computationally faster than the best existing saliency methods while producing comparable results.

02. The unsupervised architecture performs on par with supervised methods on SVHN.

03. Top-down attention improves object-background disambiguation beyond human performance.

Abstract

Deep learning architectures are an extremely powerful tool for recognizing and classifying images. However, they require supervised learning, normally work on vectors as large as the number of image pixels, and produce their best results only when trained on millions of object images. To help mitigate these issues, we propose the fusion of bottom-up saliency and top-down attention employing only unsupervised learning techniques, which helps the object recognition module focus on relevant data and learn important features that can later be fine-tuned for a specific task. In addition, by utilizing only the relevant portions of the data, the training speed can be greatly improved. We test the performance of the proposed Gamma saliency technique on the Toronto and CAT2000 databases, and the foveated vision on the Street View House Numbers (SVHN) database. The results in foveated vision show that Gamma saliency…
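
The idea of letting bottom-up saliency select "only relevant portions of the data" can be illustrated with a minimal sketch. This is not the authors' implementation: the summary above does not give the Gamma saliency formulation, so the code assumes a simple center-surround difference of two gamma-shaped radial kernels, r^(k-1) * exp(-mu * r), and then crops a fixed-size foveal patch around the saliency peak. The function names (`gamma_kernel`, `saliency_map`, `foveate`) and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np

def gamma_kernel(size, k, mu):
    """Radial kernel r^(k-1) * exp(-mu * r), normalized to sum to 1.
    For k = 1 it peaks at the center; for k > 1 it forms a ring
    whose radius is (k - 1) / mu. (Illustrative assumption.)"""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    r = np.hypot(x, y)
    g = r ** (k - 1) * np.exp(-mu * r)
    return g / g.sum()

def saliency_map(image, size=15):
    """Center-surround response: |image * (center - surround)|.
    Uses FFT-based circular convolution, so borders wrap around."""
    center = gamma_kernel(size, k=1, mu=2.0)    # narrow blob at the center
    surround = gamma_kernel(size, k=5, mu=1.0)  # ring of radius 4
    padded = np.zeros_like(image, dtype=float)
    padded[:size, :size] = center - surround
    # Shift the kernel so its center sits at index (0, 0).
    padded = np.roll(padded, (-(size // 2), -(size // 2)), axis=(0, 1))
    response = np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)).real
    return np.abs(response)

def foveate(image, saliency, patch=16):
    """Crop a patch x patch window around the most salient pixel,
    clamped so the window stays inside the image."""
    cy, cx = np.unravel_index(np.argmax(saliency), saliency.shape)
    half = patch // 2
    top = int(np.clip(cy - half, 0, image.shape[0] - patch))
    left = int(np.clip(cx - half, 0, image.shape[1] - patch))
    return image[top:top + patch, left:left + patch], (int(cy), int(cx))

# Usage: a blank image with one small bright object.
img = np.zeros((64, 64))
img[39:42, 19:22] = 1.0
sal = saliency_map(img)
patch, peak = foveate(img, sal)
print(peak, patch.shape)
```

In this toy setting a downstream recognizer would see a 16 x 16 patch instead of the full 64 x 64 image, which is the kind of data reduction the abstract credits for faster training. A top-down attention stage, which the paper also describes, is not modeled here.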

Taxonomy

Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques