Intelligent gradient amplification for deep neural networks

Sunitha Basodi; Krishna Pusuluri; Xueli Xiao; Yi Pan

arXiv:2305.18445 · cs.LG · May 31, 2023 · 1 citation

Open Access

TL;DR

This paper introduces an intelligent gradient amplification method that analyzes gradient fluctuations during training to decide which layers of a deep neural network should have their gradients amplified. The selective amplification improves training efficiency and accuracy, especially at higher learning rates.

Contribution

It proposes a novel approach that identifies which layers to apply gradient amplification to, based on their gradient fluctuations during training, so that vanishing gradients and training speed are addressed together.

Findings

01. Achieves approximately 2.5% accuracy improvement on CIFAR-10.

02. Achieves approximately 4.5% accuracy improvement on CIFAR-100.

03. Enhances performance even with higher learning rates.

Abstract

Deep learning models offer superior performance compared to other machine learning techniques for a variety of tasks and domains, but pose their own challenges. In particular, deep learning models require larger training times as the depth of a model increases, and suffer from vanishing gradients. Several solutions address these problems independently, but there have been minimal efforts to identify an integrated solution that improves the performance of a model by addressing vanishing gradients, as well as accelerates the training process to achieve higher performance at larger learning rates. In this work, we intelligently determine which layers of a deep learning model to apply gradient amplification to, using a formulated approach that analyzes gradient fluctuations of layers during training. Detailed experiments are performed for simpler and deeper neural networks using two…
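
The idea of choosing layers by their gradient fluctuations and then amplifying those layers' gradients can be illustrated with a toy sketch. Everything specific here is an assumption made for illustration, not the authors' formulation: the fluctuation measure (coefficient of variation of recorded gradient norms), the selection rule (take the `k` highest-scoring layers), the function names, and the sample numbers are all hypothetical. The sketch also only rescales each selected layer's own gradient, without modeling how amplified gradients would propagate to earlier layers during backpropagation.

```python
from statistics import mean, pstdev


def fluctuation_score(grad_norms):
    """Coefficient of variation of a layer's gradient norms (assumed measure)."""
    avg = mean(grad_norms)
    return pstdev(grad_norms) / avg if avg > 0 else 0.0


def select_layers(history, k):
    """Pick the k layers with the highest fluctuation score (assumed rule)."""
    ranked = sorted(
        history,
        key=lambda name: fluctuation_score(history[name]),
        reverse=True,
    )
    return set(ranked[:k])


def amplify(grads, selected, factor):
    """Scale gradients of the selected layers by `factor`; copy the rest unchanged."""
    return {
        name: [g * factor for g in values] if name in selected else list(values)
        for name, values in grads.items()
    }


# Hypothetical per-layer gradient norms recorded over four training steps.
history = {
    "conv1": [0.10, 0.30, 0.05, 0.25],
    "conv2": [0.20, 0.21, 0.19, 0.20],
    "fc": [0.50, 0.52, 0.49, 0.51],
}

# Hypothetical gradients for the current step.
grads = {"conv1": [0.5, -1.0], "conv2": [0.25, 0.75], "fc": [1.0, -2.0]}

selected = select_layers(history, k=1)
amplified = amplify(grads, selected, factor=2.0)
```

In a real training loop the same selection step would presumably be re-run periodically as new gradient statistics accumulate, and the amplification factor would be a tuned hyperparameter; the source text above does not specify either detail.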

Taxonomy

Topics: Domain Adaptation and Few-Shot Learning · Advanced Neural Network Applications · Machine Learning and ELM