Input Normalized Stochastic Gradient Descent Training of Deep Neural Networks

Salih Atici; Hongyi Pan; Ahmet Enis Cetin

arXiv:2212.09921 · cs.LG · June 28, 2023 · 1 citation

TL;DR

This paper introduces INSGD, a new normalization-based stochastic gradient descent method inspired by NLMS, which improves training stability and accuracy for deep neural networks on large datasets.

Contribution

The paper proposes Input Normalized SGD (INSGD), a normalization technique that leaves the error term out of the normalization and instead scales each update by a norm of the input vector to the neuron, which the authors report improves training performance.
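For orientation, the relationship to NLMS can be written out. The first line is the standard NLMS update for a linear filter with input $\mathbf{x}_k$, error $e_k$, step size $\mu$, and a small regularizer $\epsilon$. The second line is only a sketch of what an input-normalized SGD step for one neuron could look like under the description above; the symbols and the exact normalizer are assumptions, and the paper's own equations may differ.

```latex
% Standard NLMS update (adaptive filtering):
\mathbf{w}_{k+1} = \mathbf{w}_k + \mu \, \frac{e_k \, \mathbf{x}_k}{\epsilon + \lVert \mathbf{x}_k \rVert_2^2}

% Assumed input-normalized SGD step for a neuron with input \mathbf{x}_k and loss L:
\mathbf{w}_{k+1} = \mathbf{w}_k - \frac{\mu}{\epsilon + \lVert \mathbf{x}_k \rVert_2^2} \, \nabla_{\mathbf{w}} L(\mathbf{w}_k)
```

In both lines the denominator depends only on the input vector, not on the error or on the gradient magnitude.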

Findings

01. INSGD achieves higher accuracy on benchmark datasets.

02. Improves ResNet-18 accuracy on CIFAR-10 from 92.42% to 92.71%.

03. Enhances ResNet-50 accuracy on ImageNet-1K from 75.52% to 75.67%.

Abstract

In this paper, we propose a novel optimization algorithm for training machine learning models called Input Normalized Stochastic Gradient Descent (INSGD), inspired by the Normalized Least Mean Squares (NLMS) algorithm used in adaptive filtering. When training complex models on large datasets, the choice of optimizer parameters, particularly the learning rate, is crucial to avoid divergence. Our algorithm updates the network weights using stochastic gradient descent with $\ell_1$- and $\ell_2$-based normalizations applied to the learning rate, similar to NLMS. However, unlike existing normalization methods, we exclude the error term from the normalization process and instead normalize the update term using the input vector to the neuron. Our experiments demonstrate that our optimization algorithm achieves higher accuracy levels compared to different initialization settings. We evaluate…
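The idea of dividing the learning rate by a norm of the neuron's input can be illustrated on a toy problem. The following is a minimal NumPy sketch, not the authors' implementation: the helper names `input_normalized_step` and `run_demo`, the normalizers `eps + ||x||_2^2` and `eps + ||x||_1`, and the single linear neuron with squared loss are all assumptions chosen for illustration.

```python
import numpy as np


def input_normalized_step(w, grad, x, lr, eps=1e-8, norm="l2"):
    """One SGD step whose learning rate is divided by a norm of the input x.

    Hypothetical helper: the exact normalizer (eps + ||x||_2^2 or
    eps + ||x||_1) is an assumption, not taken from the paper.
    """
    if norm == "l2":
        scale = eps + float(np.dot(x, x))
    elif norm == "l1":
        scale = eps + float(np.sum(np.abs(x)))
    else:
        raise ValueError(f"unknown norm: {norm!r}")
    return w - (lr / scale) * grad


def run_demo(steps=200, input_scale=100.0, lr=0.5, seed=0):
    """Fit a single linear neuron on noise-free data with large inputs."""
    rng = np.random.default_rng(seed)
    w_true = np.array([2.0, -1.0, 0.5])
    w = np.zeros_like(w_true)
    for _ in range(steps):
        x = input_scale * rng.normal(size=w_true.shape)
        err = w @ x - w_true @ x   # prediction error of the neuron
        grad = err * x             # gradient of 0.5 * err**2 w.r.t. w
        w = input_normalized_step(w, grad, x, lr)
    return w, w_true


if __name__ == "__main__":
    w, w_true = run_demo()
    print("max abs weight error:", np.max(np.abs(w - w_true)))
```

With the $\ell_2$ normalizer this toy case reduces to NLMS: each step shrinks the weight error along `x` by the factor `1 - lr`, whatever the input scale. An unnormalized step `w - lr * grad` would instead multiply that component by `1 - lr * ||x||^2`, which diverges for inputs of this magnitude unless the learning rate is made very small.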

Code & Models

Repositories

salihfurkan/normalized-sgd (PyTorch, official)

Taxonomy

Topics: Advanced Neural Network Applications · Machine Learning and ELM · Domain Adaptation and Few-Shot Learning