AdaInject: Injection Based Adaptive Gradient Descent Optimizers for Convolutional Neural Networks
Shiv Ram Dubey, S.H. Shabbeer Basha, Satish Kumar Singh, Bidyut Baran Chaudhuri

TL;DR
AdaInject introduces a novel injection mechanism into gradient descent optimizers for CNNs, reducing overshooting and oscillations, and improving convergence and accuracy across multiple benchmarks.
Contribution
The paper proposes AdaInject, a generic injection-based method that enhances existing SGD optimizers by injecting the second order moment into the first order moment, leading to better training stability and performance.
Findings
Significant reduction in classification error rates, up to 16.54% on CIFAR10.
Effective integration with multiple state-of-the-art optimizers.
Consistent performance improvements across various CNN models and datasets.
Abstract
Convolutional neural networks (CNNs) are generally trained using stochastic gradient descent (SGD) based optimization techniques. Existing SGD optimizers generally suffer from overshooting the minimum and oscillating near it. In this paper, we propose a new approach, hereafter referred to as AdaInject, for gradient descent optimizers by injecting the second order moment into the first order moment. Specifically, the short-term change in the parameter is used as a weight to inject the second order moment into the update rule. The AdaInject optimizer controls the parameter update, avoids overshooting the minimum, and reduces oscillation near the minimum. The proposed approach is generic and can be integrated with any existing SGD optimizer. The effectiveness of the AdaInject optimizer is explained intuitively as well as through some toy examples. We also…
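The abstract describes the injection idea only in words, so the following is a minimal sketch of how it might look inside an Adam-style loop on a single scalar parameter. The exact form of the injected term (the squared gradient weighted by the last parameter change, subtracted from the gradient and divided by a constant `k`), the default `k = 2.0`, and the function name `adam_inject_sketch` are all assumptions made for illustration; they are not given in the text above and should not be read as the authors' reference implementation.

```python
import math


def adam_inject_sketch(grad_fn, theta0, lr=0.1, beta1=0.9, beta2=0.999,
                       eps=1e-8, k=2.0, steps=200):
    """Adam-style loop with a hypothetical injection term (scalar parameter).

    grad_fn: callable returning the gradient at a given parameter value.
    k:       assumed divisor for the injected quantity (illustrative only).
    """
    theta_prev = theta0  # parameter value one step back
    theta = theta0       # current parameter value
    m = 0.0              # first order moment
    v = 0.0              # second order moment
    for t in range(1, steps + 1):
        g = grad_fn(theta)
        # Short-term change in the parameter; zero on the first step.
        delta = theta - theta_prev
        # Assumed injection: the squared gradient, weighted by the
        # short-term parameter change, is mixed into the quantity that
        # feeds the first order moment.
        s = (g - delta * g * g) / k
        m = beta1 * m + (1.0 - beta1) * s
        v = beta2 * v + (1.0 - beta2) * g * g
        # Standard Adam bias correction.
        m_hat = m / (1.0 - beta1 ** t)
        v_hat = v / (1.0 - beta2 ** t)
        theta_prev, theta = theta, theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta


if __name__ == "__main__":
    # Toy problem: minimize f(x) = x^2, whose gradient is 2x.
    print(adam_inject_sketch(lambda x: 2.0 * x, theta0=3.0))
```

The sketch leaves the second order moment and the bias correction as in plain Adam, so only the line computing `s` differs from a standard Adam step. Setting `delta` to zero and `k` to 1 recovers ordinary Adam, which makes that one line easy to compare against a baseline.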
Taxonomy
Topics: Machine Learning and ELM · Advanced Neural Network Applications · Brain Tumor Detection and Classification
Methods: Stochastic Gradient Descent
