Very Deep Convolutional Networks for Large-Scale Image Recognition

Karen Simonyan; Andrew Zisserman

arXiv:1409.1556 · cs.CV · April 13, 2015 · ICLR · 76k cites

TL;DR

This paper shows that increasing the depth of convolutional networks built from very small (3x3) filters to 16-19 weight layers significantly improves large-scale image recognition accuracy, and that the learned representations reach state-of-the-art results on other datasets as well.

Contribution

The paper provides a comprehensive evaluation of very deep convolutional networks with small filters, establishing their superiority over previous architectures.

Findings

01. Networks with 16-19 weight layers significantly improve on prior, shallower configurations.

02. First place in localisation and second place in classification at the ImageNet Challenge 2014.

03. The two best-performing ConvNet models are publicly released to facilitate further research.

Abstract

In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3x3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve state-of-the-art results. We have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision.
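The abstract does not spell out why small filters and depth go together, so the following is only an illustrative sketch. It assumes stride-1 convolutions with the same number of channels in and out, and ignores biases. The per-block layer counts in `VGG16_CONVS` and `VGG19_CONVS` follow the commonly described 16- and 19-layer configurations and are an assumption here, since the abstract gives only the totals. All function and constant names are hypothetical.

```python
# Illustrative arithmetic only: compares a stack of small convolutions
# with one large convolution, and counts weight layers.

def stacked_receptive_field(kernel: int, layers: int) -> int:
    """Side length of the input patch seen by one output unit
    after `layers` stride-1 convolutions of size kernel x kernel."""
    return 1 + layers * (kernel - 1)


def stacked_weights(kernel: int, layers: int, channels: int) -> int:
    """Number of weights (biases ignored) in `layers` convolutions,
    each mapping `channels` inputs to `channels` outputs."""
    return layers * kernel * kernel * channels * channels


# Assumed conv layers per block (not stated in the abstract).
VGG16_CONVS = [2, 2, 3, 3, 3]
VGG19_CONVS = [2, 2, 4, 4, 4]
FC_LAYERS = 3  # assumed number of fully connected layers


def weight_layers(convs_per_block: list[int]) -> int:
    """Total weight layers: all convolutions plus the fully connected layers."""
    return sum(convs_per_block) + FC_LAYERS


if __name__ == "__main__":
    c = 64  # arbitrary channel count for the comparison
    print("3 x (3x3) receptive field:", stacked_receptive_field(3, 3))
    print("3 x (3x3) weights:", stacked_weights(3, 3, c))
    print("1 x (7x7) weights:", stacked_weights(7, 1, c))
    print("weight layers:", weight_layers(VGG16_CONVS), weight_layers(VGG19_CONVS))
```

Under these assumptions, three stacked 3x3 layers cover the same 7x7 input patch as a single 7x7 layer while using 27C² weights instead of 49C², and they pass through three non-linearities rather than one.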

Code & Models

Datasets

avduarte333/arXivTection · dataset · 761 dl

Taxonomy

Topics: Advanced Vision and Imaging · Advanced Neural Network Applications · Image Processing Techniques and Applications

Methods: Visual Geometry Group 19 Layer CNN · VGG-16 · Dropout · Step Decay · Weight Decay · SGD with Momentum · Xavier Initialization · Color Jitter · Random Horizontal Flip