Very Deep Convolutional Networks for Large-Scale Image Recognition
Karen Simonyan, Andrew Zisserman

TL;DR
This paper demonstrates that increasing the depth of convolutional neural networks with small filters significantly improves image recognition accuracy, leading to state-of-the-art results on large-scale datasets.
Contribution
The paper provides a thorough evaluation of convolutional networks of increasing depth built from very small (3x3) filters, showing a significant improvement over prior-art configurations.
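One way to see why small filters make greater depth affordable is to compare a stack of 3x3 convolutions with a single larger filter that covers the same input window. The sketch below is illustrative only: the function names are hypothetical, and it assumes stride-1 convolutions with the same number of input and output channels and no biases.

```python
def stacked_receptive_field(num_layers: int, kernel: int = 3) -> int:
    """Receptive field of `num_layers` stacked stride-1 convolutions.

    Each extra layer widens the window by (kernel - 1) pixels.
    """
    return 1 + num_layers * (kernel - 1)


def conv_weights(kernel: int, channels: int, num_layers: int = 1) -> int:
    """Weights in `num_layers` conv layers with `channels` inputs and outputs.

    Biases are ignored (an assumption made to keep the comparison simple).
    """
    return num_layers * kernel * kernel * channels * channels


if __name__ == "__main__":
    channels = 256  # hypothetical channel width, chosen for illustration
    # Three stacked 3x3 layers see the same 7x7 window as one 7x7 layer.
    print("receptive field of 3 x (3x3):", stacked_receptive_field(3))
    print("weights, 3 x (3x3):", conv_weights(3, channels, num_layers=3))
    print("weights, 1 x (7x7):", conv_weights(7, channels))
```

Under these assumptions the stack needs 27C² weights against 49C² for the single 7x7 layer, and it applies three non-linearities instead of one.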
Findings
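Networks with 16-19 weight layers significantly outperform shallower prior-art configurations.
First place in localisation and second place in classification at the ImageNet Challenge 2014.
The representations generalise well to other datasets, where they achieve state-of-the-art results.
The two best-performing ConvNet models are publicly released for further research.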
Abstract
In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3x3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve state-of-the-art results. We have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision.
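The "16-19 weight layers" in the abstract can be made concrete by counting the layers that carry weights in the two deepest configurations. The sketch below is a minimal illustration, not the authors' code. The names `CONFIGS`, `FC_SIZES`, `count_weight_layers` and `count_parameters` are hypothetical, and the details not stated in the abstract (a 224x224 RGB input, five 2x2 max-pooling stages, and three fully connected layers of 4096, 4096 and 1000 units) are assumptions taken from the commonly used VGG-16 and VGG-19 layouts.

```python
# Integers are the output channels of a 3x3 convolution;
# "M" marks a 2x2 max-pooling layer, which has no weights.
CONFIGS = {
    "VGG-16": [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
               512, 512, 512, "M", 512, 512, 512, "M"],
    "VGG-19": [64, 64, "M", 128, 128, "M", 256, 256, 256, 256, "M",
               512, 512, 512, 512, "M", 512, 512, 512, 512, "M"],
}

# Assumed classifier head: two hidden layers plus a 1000-way output.
FC_SIZES = [4096, 4096, 1000]


def count_weight_layers(config):
    """Count convolutional plus fully connected layers (pooling excluded)."""
    conv_layers = sum(1 for item in config if item != "M")
    return conv_layers + len(FC_SIZES)


def count_parameters(config, in_channels=3, input_size=224):
    """Count weights and biases for one configuration."""
    total = 0
    channels, size = in_channels, input_size
    for item in config:
        if item == "M":
            size //= 2  # 2x2 pooling with stride 2 halves the resolution
        else:
            total += 3 * 3 * channels * item + item  # 3x3 kernel plus biases
            channels = item
    features = channels * size * size  # flattened input to the classifier
    for out_features in FC_SIZES:
        total += features * out_features + out_features
        features = out_features
    return total


if __name__ == "__main__":
    for name, config in CONFIGS.items():
        print(name, count_weight_layers(config), "weight layers,",
              f"{count_parameters(config):,}", "parameters")
```

Under these assumptions most of the parameters sit in the first fully connected layer rather than in the convolutional stack, so moving from 16 to 19 weight layers adds comparatively few parameters.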
Code & Models
- 🤗 glasses/vgg11 · model · 1 dl
- 🤗 glasses/vgg11_bn · model
- 🤗 glasses/vgg13_bn · model · 1 dl
- 🤗 glasses/vgg19_bn · model · 1 dl
- 🤗 matthias-wright/vgg · model · ♡ 1
- 🤗 timm/vgg11.tv_in1k · model · 567 dl
- 🤗 timm/vgg11_bn.tv_in1k · model · 1.2k dl
- 🤗 timm/vgg13.tv_in1k · model · 198 dl
- 🤗 timm/vgg13_bn.tv_in1k · model · 101 dl
- 🤗 timm/vgg16.tv_in1k · model · 10k dl · ♡ 7
Taxonomy
Topics: Advanced Vision and Imaging · Advanced Neural Network Applications · Image Processing Techniques and Applications
Methods: Visual Geometry Group 19 Layer CNN · VGG-16 · Dropout · Step Decay · Weight Decay · SGD with Momentum · Xavier Initialization · Color Jitter · Random Horizontal Flip
