Rethinking the Inception Architecture for Computer Vision
Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jonathon Shlens, Zbigniew Wojna

TL;DR
This paper proposes a rethought Inception architecture for computer vision that improves efficiency and accuracy by using factorized convolutions and regularization, achieving state-of-the-art results with fewer parameters.
Contribution
The paper introduces a novel Inception-based network design that enhances computational efficiency and accuracy, setting new benchmarks on the ImageNet classification challenge.
Findings
Achieved 21.2% top-1 error with single-frame evaluation, at a cost of 5 billion multiply-adds per inference.
Used fewer than 25 million parameters.
An ensemble of models reached 3.5% top-5 error on the validation set.
Abstract
Convolutional networks are at the core of most state-of-the-art computer vision solutions for a wide variety of tasks. Since 2014 very deep convolutional networks started to become mainstream, yielding substantial gains in various benchmarks. Although increased model size and computational cost tend to translate to immediate quality gains for most tasks (as long as enough labeled data is provided for training), computational efficiency and low parameter count are still enabling factors for various use cases such as mobile vision and big-data scenarios. Here we explore ways to scale up networks in ways that aim at utilizing the added computation as efficiently as possible by suitably factorized convolutions and aggressive regularization. We benchmark our methods on the ILSVRC 2012 classification challenge validation set and demonstrate substantial gains over the state of the art: 21.2% top-1…
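The abstract names "suitably factorized convolutions" without spelling them out in the excerpt above. As a rough illustration only, the sketch below counts weights for two common factorizations: replacing one 5x5 convolution with two stacked 3x3 convolutions, and replacing one 7x7 convolution with a 1x7 followed by a 7x1. Both replacements cover the same receptive field at stride 1. The helper names (`conv_params`, `savings`) and the channel width of 64 are assumptions made for this example, not values from the paper, and biases are ignored.

```python
def conv_params(kernel_h: int, kernel_w: int, c_in: int, c_out: int) -> int:
    """Number of weights in one 2-D convolution layer (bias terms ignored)."""
    return kernel_h * kernel_w * c_in * c_out


def savings(factorized: int, original: int) -> float:
    """Fraction of weights removed by the factorized variant."""
    return 1.0 - factorized / original


# Hypothetical channel width, chosen only for illustration.
C = 64

# One 5x5 convolution vs. two stacked 3x3 convolutions (same 5x5 receptive field).
single_5x5 = conv_params(5, 5, C, C)
stacked_3x3 = 2 * conv_params(3, 3, C, C)

# One 7x7 convolution vs. a 1x7 followed by a 7x1 (same 7x7 receptive field).
single_7x7 = conv_params(7, 7, C, C)
asymmetric_7 = conv_params(1, 7, C, C) + conv_params(7, 1, C, C)

if __name__ == "__main__":
    print(f"5x5: {single_5x5} weights, two 3x3: {stacked_3x3} weights, "
          f"saving {savings(stacked_3x3, single_5x5):.1%}")
    print(f"7x7: {single_7x7} weights, 1x7 + 7x1: {asymmetric_7} weights, "
          f"saving {savings(asymmetric_7, single_7x7):.1%}")
```

With equal input and output channels the saving does not depend on the channel width: the stacked 3x3 pair uses 18/25 of the 5x5 weights, and the asymmetric pair uses 14/49 of the 7x7 weights. At a fixed output resolution the multiply-add count scales by the same ratios.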
Code & Models
- 🤗 litert-community/inception_v3 · model · 116 dl · ♡ 3
- 🤗 kadirnar/timm_model_list · model · ♡ 1
- 🤗 timm/inception_v3.gluon_in1k · model · 1.5k dl · ♡ 1
- 🤗 timm/inception_v3.tf_adv_in1k · model · 13k dl · ♡ 1
- 🤗 timm/inception_v3.tf_in1k · model · 764 dl
- 🤗 timm/inception_v3.tv_in1k · model · 16k dl · ♡ 1
- 🤗 dg845/consistency-model-pipelines · model · 28 dl · ♡ 1
- 🤗 dg845/diffusers-cd_bedroom256_l2 · model · 5 dl
- 🤗 dg845/diffusers-cd_cat256_l2 · model · 2 dl
- 🤗 dg845/diffusers-cd_imagenet64_lpips · model · 6 dl
Taxonomy
Topics: Advanced Neural Network Applications · Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications
Methods: 1x1 Convolution · Average Pooling · Auxiliary Classifier · Convolution · Dense Connections · Max Pooling
