Spatially Adaptive Computation Time for Residual Networks

Michael Figurnov; Maxwell D. Collins; Yukun Zhu; Li Zhang; Jonathan Huang; Dmitry Vetrov; Ruslan Salakhutdinov

arXiv:1612.02297 · cs.CV · July 4, 2017 · 25 cites

TL;DR

This paper introduces a Residual Network architecture that adapts the amount of computation to each image region. It improves efficiency on ImageNet classification and COCO object detection, and its computation time maps correlate well with human eye fixations.

Contribution

The paper presents a spatially adaptive computation method for Residual Networks that is end-to-end trainable, deterministic, and applicable without modification to image classification, object detection, and image segmentation.

Findings

01 Improves the computational efficiency of Residual Networks on ImageNet classification and COCO object detection.

02 Computation time maps correlate with human eye fixation positions on the CAT2000 saliency dataset.

03 Applies to diverse vision problems without modification.

Abstract

This paper proposes a deep learning architecture based on Residual Networks that dynamically adjusts the number of executed layers for the regions of the image. This architecture is end-to-end trainable, deterministic and problem-agnostic. It is therefore applicable without any modifications to a wide range of computer vision problems such as image classification, object detection and image segmentation. We present experimental results showing that this model improves the computational efficiency of Residual Networks on the challenging ImageNet classification and COCO object detection datasets. Additionally, we evaluate the computation time maps on the visual saliency dataset CAT2000 and find that they correlate surprisingly well with human eye fixation positions.
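
To make the idea of a per-region number of executed layers concrete, here is a minimal NumPy sketch of a per-position halting rule in the style of adaptive computation time. It is an illustration under stated assumptions, not the code from the official repository: the function name `spatial_halting`, the `(units, height, width)` array layout, and the threshold `eps` are hypothetical. For clarity it scores every unit at every position and masks afterwards, whereas an efficient implementation would skip halted positions.

```python
import numpy as np

def spatial_halting(halting_scores, eps=0.01):
    """Per-position halting over a stack of residual units (illustrative sketch).

    halting_scores: array of shape (L, H, W) with values in [0, 1], one
        score per residual unit and spatial position (assumed layout).
    eps: hypothetical threshold; a position halts at the first unit where
        its cumulative score reaches 1 - eps.

    Returns (num_units, weights):
        num_units: (H, W) integer map of how many units each position runs,
            i.e. a "computation time map".
        weights: (L, H, W) non-negative weights that sum to 1 over units.
    """
    scores = np.array(halting_scores, dtype=float)  # copy, leave input intact
    num_layers = scores.shape[0]
    scores[-1] = 1.0  # force every position to halt by the last unit

    cumulative = np.cumsum(scores, axis=0)
    halted = cumulative >= 1.0 - eps
    halt_index = np.argmax(halted, axis=0)  # first unit (0-based) that halts

    unit_ids = np.arange(num_layers)[:, None, None]
    before = unit_ids < halt_index   # units fully counted at each position
    at_halt = unit_ids == halt_index  # the unit where the position stops

    # The halting unit receives the remainder so the weights sum to one.
    remainder = 1.0 - np.sum(scores * before, axis=0)
    weights = scores * before + remainder * at_halt
    return halt_index + 1, weights
```

Under these assumptions, the output feature at each position would be the weighted sum of the unit outputs, for example `np.sum(weights[..., None] * unit_outputs, axis=0)` for a hypothetical `unit_outputs` array of shape `(L, H, W, C)`. The rule contains no sampling, which is consistent with the abstract's description of the architecture as deterministic.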

Code & Models

Repositories

mfigurnov/sact · TensorFlow · Official

Taxonomy

Topics: Visual Attention and Saliency Detection · Advanced Neural Network Applications · Domain Adaptation and Few-Shot Learning