Training Efficient CNNS: Tweaking the Nuts and Bolts of Neural Networks for Lighter, Faster and Robust Models
Sabeesh Ethiraj, Bharath Kumar Bolla

TL;DR
This paper explores techniques to design lightweight, fast, and robust CNNs by systematically applying various optimization and architectural strategies, achieving high accuracy with significantly fewer parameters.
Contribution
It presents a phased approach to build efficient CNNs by combining multiple state-of-the-art techniques, reducing parameters while maintaining high accuracy.
Findings
Achieved 99.2% accuracy on MNIST with only 1500 parameters.
Attained 86.01% accuracy on CIFAR-10 with just over 140K parameters.
Demonstrated the effectiveness of combining multiple efficiency techniques.
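To give a feel for how small these parameter budgets are, the sketch below counts the trainable parameters of a purely hypothetical three-layer CNN for 28x28 grayscale input. It is not the architecture from the paper. The layer sizes, the helper names `conv2d_params` and `dense_params`, and the assumed 7x7x16 feature map are all illustrative assumptions. It compares an average-pooling classification head, which adds no parameters, with a flatten-plus-dense head.

```python
# Hypothetical example: layer sizes are illustrative, not taken from the paper.

def conv2d_params(in_ch: int, out_ch: int, kernel: int, bias: bool = True) -> int:
    """Trainable parameters of a standard 2D convolution with a square kernel."""
    weights = kernel * kernel * in_ch * out_ch
    return weights + (out_ch if bias else 0)


def dense_params(in_features: int, out_features: int, bias: bool = True) -> int:
    """Trainable parameters of a fully connected layer."""
    return in_features * out_features + (out_features if bias else 0)


# Variant A: a 1x1 convolution maps 16 channels to 10 class scores, and
# average pooling over the spatial dimensions collapses them to a 10-way
# output. The pooling step itself has no trainable parameters.
pooled_head = [
    conv2d_params(1, 8, 3),    # 3x3 conv, 1 -> 8 channels
    conv2d_params(8, 16, 3),   # 3x3 conv, 8 -> 16 channels
    conv2d_params(16, 10, 1),  # 1x1 conv, 16 -> 10 class channels
]
total_pooled = sum(pooled_head)

# Variant B: same two 3x3 convolutions, but the (assumed) 7x7x16 feature map
# is flattened and fed to a dense layer with 10 outputs.
dense_head = [
    conv2d_params(1, 8, 3),
    conv2d_params(8, 16, 3),
    dense_params(7 * 7 * 16, 10),
]
total_dense = sum(dense_head)

print(f"pooled head: {total_pooled} parameters")
print(f"dense head:  {total_dense} parameters")
```

In this toy setup the dense head alone accounts for most of the parameters, which is one reason pooling-based heads are a common choice when the parameter budget is in the low thousands.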
Abstract
Deep Learning has revolutionized the fields of computer vision, natural language understanding, speech recognition, information retrieval and more. Many techniques have evolved over the past decade that make models lighter, faster, and more robust with better generalization. However, many deep learning practitioners persist with pre-trained models and architectures trained mostly on standard datasets such as ImageNet, MS-COCO, IMDB-Wiki Dataset, and Kinetics-700, and are either hesitant to redesign the architecture from scratch or unaware that doing so will lead to better performance. This scenario leads to inefficient models that are not suitable for various devices such as mobile, edge, and fog. In addition, these conventional training methods are of concern as they consume a lot of computing power. In this paper, we revisit various SOTA techniques that deal with architecture efficiency (Global…
Taxonomy
Topics: Machine Learning and Data Classification · Advanced Neural Network Applications · Anomaly Detection Techniques and Applications
Methods: Convolution · Average Pooling · Attentive Walk-Aggregating Graph Neural Network
