Near-Optimal Glimpse Sequences for Improved Hard Attention Neural Network Training

William Harvey; Michael Teng; Frank Wood

arXiv:1906.05462 · cs.LG · June 16, 2020 · 1 citation

TL;DR

This paper introduces a Bayesian optimal experimental design approach to generate near-optimal glimpse sequences for hard attention in neural networks, improving training efficiency and reducing variance.

Contribution

The paper frames hard attention as a Bayesian optimal experimental design (BOED) problem and proposes a method for generating reusable near-optimal attention sequences that can be used to improve the training of hard attention networks.

Findings

01. Generated near-optimal attention sequences improve training speed.

02. Sequences can be reused across different networks for the same task.

03. The method reduces training variance in hard attention models.

Abstract

Hard visual attention is a promising approach to reduce the computational burden of modern computer vision methodologies. Hard attention mechanisms are typically non-differentiable. They can be trained with reinforcement learning but the high-variance training this entails hinders more widespread application. We show how hard attention for image classification can be framed as a Bayesian optimal experimental design (BOED) problem. From this perspective, the optimal locations to attend to are those which provide the greatest expected reduction in the entropy of the classification distribution. We introduce methodology from the BOED literature to approximate this optimal behaviour, and use it to generate "near-optimal" sequences of attention locations. We then show how to use such sequences to partially supervise, and therefore speed up, the training of a hard attention mechanism.…
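
The selection criterion in the abstract (attend where the expected entropy of the classification distribution drops the most) can be illustrated with a small discrete toy. The sketch below is not the paper's method, which approximates this quantity for images using techniques from the BOED literature. It assumes a finite set of classes, a finite set of possible glimpse observations, and a known likelihood table per location. All names (`entropy`, `expected_posterior_entropy`, `best_location`, and the example tables) are hypothetical.

```python
import numpy as np

def entropy(p):
    """Shannon entropy (in nats) of a discrete distribution, treating 0*log(0) as 0."""
    p = np.asarray(p, dtype=float)
    nz = p > 0
    return float(-np.sum(p[nz] * np.log(p[nz])))

def expected_posterior_entropy(prior, lik):
    """Expected entropy of p(y | o) after observing a glimpse at one location.

    prior: shape (K,), current class distribution p(y).
    lik:   shape (K, O), assumed likelihood table p(o | y) for this location.
    """
    joint = prior[:, None] * lik      # p(y, o), shape (K, O)
    marginal = joint.sum(axis=0)      # p(o), shape (O,)
    total = 0.0
    for o in range(lik.shape[1]):
        if marginal[o] > 0:
            posterior = joint[:, o] / marginal[o]   # p(y | o) by Bayes' rule
            total += marginal[o] * entropy(posterior)
    return total

def best_location(prior, liks):
    """Pick the location with the lowest expected posterior entropy.

    Because the prior entropy is the same for every location, this is
    equivalent to maximising the expected reduction in entropy.
    """
    scores = [expected_posterior_entropy(prior, lik) for lik in liks]
    return int(np.argmin(scores)), scores

# Toy example (made-up numbers): two classes, two candidate locations.
prior = np.array([0.5, 0.5])
uninformative = np.array([[0.5, 0.5], [0.5, 0.5]])  # observation independent of class
informative = np.array([[1.0, 0.0], [0.0, 1.0]])    # observation reveals the class
choice, scores = best_location(prior, [uninformative, informative])
```

In this toy, the second location is selected because observing it removes all uncertainty about the class, while the first leaves the class distribution unchanged. Repeating the selection with the posterior as the new prior would give a greedy sequence of locations under these assumptions.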

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Image Processing Techniques and Applications

Methods: SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings