TL;DR
Plain multi-layer perceptrons trained with on-line back-propagation reach a 0.35% error rate on the MNIST handwritten digit benchmark, the best result reported at the time of publication. The ingredients are many hidden layers, many neurons per layer, numerous deformed training images, and GPU acceleration.
Contribution
It shows that plain but deep and wide multi-layer perceptrons, trained on many deformed images with GPU acceleration, outperform previously reported methods on MNIST.
Findings
0.35% error rate on MNIST
Many hidden layers and many neurons per layer are central to reaching this error rate
GPU acceleration significantly speeds up training
Abstract
Good old on-line back-propagation for plain multi-layer perceptrons yields a very low 0.35% error rate on the famous MNIST handwritten digits benchmark. All we need to achieve this best result so far are many hidden layers, many neurons per layer, numerous deformed training images, and graphics cards to greatly speed up learning.
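To make "on-line back-propagation for plain multi-layer perceptrons" concrete, here is a minimal NumPy sketch in which the weights are updated after every single training example rather than after a batch. It is an illustration under stated assumptions, not the authors' implementation: the plain `tanh` activation, the squared-error loss, the initial weight range, and the helper names `init_mlp`, `forward`, `online_step`, and `mean_squared_error` are all choices made for this example, and the GPU acceleration and image deformation described in the abstract are left out.

```python
import numpy as np

def init_mlp(sizes, rng):
    """Create (W, b) pairs for layer sizes such as [784, 1000, 500, 10].

    The small uniform initial range is an assumption made for this sketch.
    """
    return [(rng.uniform(-0.05, 0.05, (n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Return the activations of every layer, starting with the input itself."""
    acts = [x]
    for W, b in params:
        acts.append(np.tanh(acts[-1] @ W + b))
    return acts

def online_step(params, x, target, lr):
    """Update all layers in place from ONE example (on-line gradient descent)."""
    acts = forward(params, x)
    # Error signal at the output: d(0.5 * squared error)/d(pre-activation).
    delta = (acts[-1] - target) * (1.0 - acts[-1] ** 2)
    for i in reversed(range(len(params))):
        W, b = params[i]
        grad_W = np.outer(acts[i], delta)
        grad_b = delta
        if i > 0:
            # Propagate the error backwards using the weights before the update.
            delta = (delta @ W.T) * (1.0 - acts[i] ** 2)
        params[i] = (W - lr * grad_W, b - lr * grad_b)

def mean_squared_error(params, X, T):
    """Average squared error over a whole data set (used for monitoring only)."""
    return float(np.mean((forward(params, X)[-1] - T) ** 2))
```

In use, one epoch is a loop over the shuffled training set that calls `online_step(params, X[i], T[i], lr)` once per example. Under the abstract's recipe, each example would be a freshly deformed image, and `sizes` would contain many wide hidden layers.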
