The human visual system and CNNs can both support robust online translation tolerance following extreme displacements

Ryan Blything, Valerio Biscione, Ivan I. Vankov, Casimir J.H. Ludwig, and Jeffrey S. Bowers

arXiv:2009.12855 · q-bio.NC · December 9, 2020

Open Access

TL;DR

This study shows that both humans and CNNs can recognize objects across large retinal displacements: humans maintain high accuracy after translations of up to 18 degrees, and CNNs do so given suitable training or architectural modifications.

Contribution

The paper demonstrates that humans exhibit substantial translation tolerance and that CNNs can be trained or modified to reproduce this robustness, helping to reconcile earlier conflicting reports on the extent of translation tolerance.

Findings

01 Humans recognize novel objects with high accuracy after translations of up to 18 degrees of visual angle.

02 CNNs support translation tolerance given suitable training or architectural changes.

03 Pretrained CNNs with Global Average Pooling (GAP) layers have larger receptive fields.
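The link between Global Average Pooling and translation tolerance can be illustrated with a toy example. The sketch below is not the authors' model: the patch, kernel, canvas size, and every function name are hypothetical choices made only for illustration. It places the same small patch at two locations on a blank canvas, applies one convolution and a ReLU, and compares the pooled response with the unpooled feature map. The equality is exact here only because the patch stays far enough from the canvas border.

```python
# Toy illustration (hypothetical names and values, not the paper's network).

def conv2d_valid(image, kernel):
    """Plain 2D cross-correlation with no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]

def relu(fmap):
    """Element-wise rectification."""
    return [[max(0.0, v) for v in row] for row in fmap]

def global_average_pool(fmap):
    """Collapse a feature map to its mean, discarding position."""
    total = sum(sum(row) for row in fmap)
    return total / (len(fmap) * len(fmap[0]))

def place(patch, size, top, left):
    """Draw the patch on a blank size x size canvas at (top, left)."""
    canvas = [[0.0] * size for _ in range(size)]
    for i, row in enumerate(patch):
        for j, value in enumerate(row):
            canvas[top + i][left + j] = value
    return canvas

PATCH = [[1.0, 2.0], [3.0, 4.0]]            # assumed toy "object"
KERNEL = [[1.0, 0.0, -1.0],
          [2.0, 0.0, -2.0],
          [1.0, 0.0, -1.0]]                 # assumed edge-like filter

def feature_map(top, left, size=12):
    return relu(conv2d_valid(place(PATCH, size, top, left), KERNEL))

def gap_response(top, left, size=12):
    return global_average_pool(feature_map(top, left, size))

if __name__ == "__main__":
    a, b = gap_response(3, 3), gap_response(6, 7)
    print("GAP responses equal:", abs(a - b) < 1e-9)
    print("Raw feature maps equal:", feature_map(3, 3) == feature_map(6, 7))
```

In this toy setting, translating the patch only shifts the feature map, so its mean is unchanged while the position-specific map itself differs. A readout that sits on the pooled value therefore responds identically at both locations, whereas a readout wired to individual feature-map positions would not.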

Abstract

Visual translation tolerance refers to our capacity to recognize objects over a wide range of different retinal locations. Although translation is perhaps the simplest spatial transform that the visual system needs to cope with, the extent to which the human visual system can identify objects at previously unseen locations is unclear, with some studies reporting near complete invariance over 10° and others reporting zero invariance at 4° of visual angle. Similarly, there is confusion regarding the extent of translation tolerance in computational models of vision, as well as the degree of match between human and model performance. Here we report a series of eye-tracking studies (total N=70) demonstrating that novel objects trained at one retinal location can be recognized at high accuracy rates following translations up to 18°. We also show that standard deep convolutional…

Taxonomy

Topics: Retinal Imaging and Analysis · Visual Attention and Saliency Detection · Visual perception and processing mechanisms

Methods: Global Average Pooling · Average Pooling