Reconstruction-guided attention improves the robustness and shape processing of neural networks

Seoyoung Ahn; Hossein Adeli; Gregory J. Zelinsky

arXiv:2209.13620·cs.CV·February 9, 2023

TL;DR

This paper introduces a reconstruction-guided attention model in which an object reconstruction serves as top-down feedback to recognition, improving the robustness and shape processing of neural networks, especially under challenging image corruptions.

Contribution

The paper presents an iterative encoder-decoder network that uses its own object reconstruction as top-down attentional feedback, and reports improved robustness and interpretability on out-of-distribution digit recognition (MNIST-C).
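The feedback loop described above can be illustrated with a deliberately tiny sketch. This is not the authors' architecture or code: the template-matching "encoder", the mixture "decoder", and the names `encode`, `decode`, and `recon_guided_recognition` are hypothetical stand-ins chosen only to show the idea of using a reconstruction as a spatial attention mask over the input across iterations.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def encode(image, templates):
    """Toy encoder (assumption): class probabilities from template dot products."""
    scores = [sum(p * t for p, t in zip(image, tpl)) for tpl in templates]
    return softmax(scores)


def decode(probs, templates):
    """Toy decoder (assumption): reconstruction as a probability-weighted template mix."""
    n = len(templates[0])
    return [sum(p * tpl[i] for p, tpl in zip(probs, templates)) for i in range(n)]


def recon_guided_recognition(image, templates, steps=3):
    """Iteratively re-weight the input with a mask derived from the reconstruction."""
    attended = list(image)
    mask = [1.0] * len(image)
    for _ in range(steps):
        probs = encode(attended, templates)       # feed-forward pass
        recon = decode(probs, templates)          # top-down reconstruction
        peak = max(recon)
        # Normalize the reconstruction into a [0, 1] spatial attention mask.
        mask = [r / peak for r in recon] if peak > 0 else [1.0] * len(recon)
        # Always mask the original input, so the mask does not compound.
        attended = [x * m for x, m in zip(image, mask)]
    return encode(attended, templates), mask


if __name__ == "__main__":
    # Two 4-pixel "digit" templates and an input of class 0 with clutter at pixel 2.
    templates = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]]
    image = [1.0, 1.0, 0.8, 0.0]
    print("feed-forward only:", encode(image, templates))
    print("with feedback:    ", recon_guided_recognition(image, templates)[0])
```

In this toy setting the mask down-weights pixels the current reconstruction does not explain, so clutter contributes less to the next feed-forward pass. A learned encoder and decoder would replace the fixed templates in any realistic version.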

Findings

01 On average, outperforms the other models tested on MNIST-C, including feedforward CNNs and adversarially trained networks

02 Shows robustness to blur, noise, and occlusion

03 Reveals roles of spatial and feature-based attention in recognition

Abstract

Many visual phenomena suggest that humans use top-down generative or reconstructive processes to create visual percepts (e.g., imagery, object completion, pareidolia), but little is known about the role reconstruction plays in robust object recognition. We built an iterative encoder-decoder network that generates an object reconstruction and used it as top-down attentional feedback to route the most relevant spatial and feature information to feed-forward object recognition processes. We tested this model using the challenging out-of-distribution digit recognition dataset, MNIST-C, where 15 different types of transformation and corruption are applied to handwritten digit images. Our model showed strong generalization performance against various image perturbations, on average outperforming all other models including feedforward CNNs and adversarially trained networks. Our model is…

Code & Models

Repositories

ahnchive/recon-mnist (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Cell Image Analysis Techniques

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings