Unsupervised Foveal Vision Neural Networks with Top-Down Attention
Ryan Burt, Nina N. Thigpen, Andreas Keil, Jose C. Principe

TL;DR
This paper introduces an unsupervised foveal vision neural network that combines bottom-up saliency and top-down attention, improving object recognition speed and accuracy, and outperforming humans in disambiguating objects from backgrounds.
Contribution
It presents a novel unsupervised architecture integrating Gamma saliency and top-down attention, enhancing scene understanding and CNN performance without supervised training.
Findings
Gamma saliency is computationally faster and comparable to the best methods.
The unsupervised architecture performs on par with supervised methods on SVHN.
Top-down attention improves object-background disambiguation beyond human performance.
Abstract
Deep learning architectures are an extremely powerful tool for recognizing and classifying images. However, they require supervised learning, normally operate on vectors the size of the image in pixels, and produce the best results when trained on millions of object images. To help mitigate these issues, we propose the fusion of bottom-up saliency and top-down attention employing only unsupervised learning techniques, which helps the object recognition module focus on relevant data and learn important features that can later be fine-tuned for a specific task. In addition, by utilizing only relevant portions of the data, the training speed can be greatly improved. We test the performance of the proposed Gamma saliency technique on the Toronto and CAT2000 databases, and the foveated vision on the Street View House Numbers (SVHN) database. The results in foveated vision show that Gamma saliency…
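To make the idea of "utilizing only relevant portions of the data" concrete, here is a minimal sketch of foveation driven by a saliency map. It is not the paper's Gamma saliency or its architecture: the saliency map is taken as a given input, and the function names (`crop_fovea`, `fixations`), the square window, and the inhibition-of-return step (suppressing a region once it has been visited) are all assumptions made for illustration.

```python
def crop_fovea(image, saliency, size):
    """Crop a size x size patch around the most salient pixel.

    `image` and `saliency` are equally sized 2D lists (hypothetical format);
    the window is shifted inward so it always stays inside the image.
    Assumes size <= image height and width.
    """
    rows, cols = len(saliency), len(saliency[0])
    # Locate the saliency maximum (first maximum wins on ties).
    r, c = max(
        ((i, j) for i in range(rows) for j in range(cols)),
        key=lambda rc: saliency[rc[0]][rc[1]],
    )
    half = size // 2
    top = min(max(r - half, 0), rows - size)
    left = min(max(c - half, 0), cols - size)
    patch = [row[left:left + size] for row in image[top:top + size]]
    return patch, (r, c)


def fixations(image, saliency, size, n):
    """Return n (patch, center) pairs, most salient first.

    After each fixation the visited neighbourhood is suppressed
    (an assumed inhibition-of-return step) so the next fixation
    lands somewhere else.
    """
    sal = [row[:] for row in saliency]  # work on a copy
    rows, cols = len(sal), len(sal[0])
    half = size // 2
    out = []
    for _ in range(n):
        patch, (r, c) = crop_fovea(image, sal, size)
        out.append((patch, (r, c)))
        for i in range(max(r - half, 0), min(r + half + 1, rows)):
            for j in range(max(c - half, 0), min(c + half + 1, cols)):
                sal[i][j] = float("-inf")
    return out
```

In a pipeline like the one the abstract describes, only these small patches, rather than the full image, would be passed to the recognition module, which is where a training-speed benefit would be expected to come from.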
Taxonomy
Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Advanced Image and Video Retrieval Techniques
