Foveation in the Era of Deep Learning

George Killick; Paul Henderson; Paul Siebert; Gerardo Aragon-Camarasa

arXiv:2312.01450 · cs.CV · December 5, 2023 · 1 citation

TL;DR

This paper presents a differentiable foveated active vision system using graph convolutional networks that learns to attend to relevant image regions, improving object recognition performance over previous methods.

Contribution

Introduces an end-to-end trainable foveated vision architecture with a novel sampling method and graph-based processing, advancing active visual attention models.

Findings

01 Outperforms a state-of-the-art CNN and prior foveated vision architectures with a comparable number of parameters

02 Learns to attend iteratively to the image regions relevant for classification

03 Improves object recognition accuracy for a given pixel or computation budget

Abstract

In this paper, we tackle the challenge of actively attending to visual scenes using a foveated sensor. We introduce an end-to-end differentiable foveated active vision architecture that leverages a graph convolutional network to process foveated images, and a simple yet effective formulation for foveated image sampling. Our model learns to iteratively attend to regions of the image relevant for classification. We conduct detailed experiments on a variety of image datasets, comparing the performance of our method with previous approaches to foveated vision while measuring how different choices, such as the degree of foveation and the number of fixations the network performs, affect object recognition performance. We find that our model outperforms a state-of-the-art CNN and foveated vision architectures with a comparable number of parameters for a given pixel or computation budget.

Code & Models

Repositories

georgekillick90/fovconvnext (PyTorch, official)

Taxonomy

Topics: Visual Attention and Saliency Detection · CCD and CMOS Imaging Sensors · Advanced Image and Video Retrieval Techniques