Intelligent gradient amplification for deep neural networks
Sunitha Basodi, Krishna Pusuluri, Xueli Xiao, Yi Pan

TL;DR
This paper introduces an intelligent gradient amplification method that selectively enhances gradients in deep neural networks, improving training efficiency and accuracy, especially at higher learning rates, by analyzing gradient fluctuations during training.
Contribution
It proposes a novel approach to identify layers for gradient amplification based on gradient fluctuations, addressing vanishing gradients and training speed simultaneously.
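The idea of choosing layers from their gradient fluctuations and then amplifying only those layers can be illustrated with a minimal sketch. This is not the paper's formulation, which is not given in this summary. The fluctuation measure (mean relative change in per-layer gradient norm between consecutive epochs), the threshold rule, and the names `fluctuation_score`, `select_layers`, `amplify_gradients`, `threshold`, and `factor` are all assumptions made for illustration.

```python
from statistics import mean


def fluctuation_score(grad_norms):
    """Mean absolute relative change between consecutive gradient norms.

    Assumed measure for illustration; the paper's own formula may differ.
    """
    changes = [
        abs(curr - prev) / (abs(prev) + 1e-12)  # epsilon avoids division by zero
        for prev, curr in zip(grad_norms, grad_norms[1:])
    ]
    return mean(changes) if changes else 0.0


def select_layers(history, threshold):
    """Return the set of layer names whose score is at least `threshold`.

    `history` maps a layer name to its gradient norms recorded per epoch.
    The direction of the rule (high fluctuation selects) is hypothetical.
    """
    return {
        name for name, norms in history.items()
        if fluctuation_score(norms) >= threshold
    }


def amplify_gradients(grads, selected, factor):
    """Scale the gradients of the selected layers by `factor`; leave others unchanged."""
    return {
        name: [g * factor if name in selected else g for g in values]
        for name, values in grads.items()
    }


# Usage with made-up numbers: two layers, three epochs of recorded norms.
history = {"conv1": [1.0, 1.0, 1.0], "conv2": [1.0, 2.0, 1.0]}
selected = select_layers(history, threshold=0.5)
grads = {"conv1": [0.1, -0.2], "conv2": [0.1, -0.2]}
amplified = amplify_gradients(grads, selected, factor=2.0)
```

In a real training loop the amplification would be applied during backpropagation, before the optimizer step, and the selection would be refreshed as new gradient statistics arrive. How often to refresh and what factor to use are left open here.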
Findings
Achieves approximately 2.5% accuracy improvement on CIFAR-10.
Achieves approximately 4.5% accuracy improvement on CIFAR-100.
Enhances performance even with higher learning rates.
Abstract
Deep learning models offer superior performance compared to other machine learning techniques for a variety of tasks and domains, but pose their own challenges. In particular, deep learning models require longer training times as the depth of a model increases, and suffer from vanishing gradients. Several solutions address these problems independently, but there have been minimal efforts to identify an integrated solution that both improves the performance of a model by addressing vanishing gradients and accelerates the training process to achieve higher performance at larger learning rates. In this work, we intelligently determine which layers of a deep learning model to apply gradient amplification to, using a formulated approach that analyzes gradient fluctuations of layers during training. Detailed experiments are performed for simpler and deeper neural networks using two…
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Machine Learning and ELM
