Newton methods based convolution neural networks using parallel processing

Ujjwal Thakur; Anuj Sharma

arXiv:2112.01401·cs.LG·April 6, 2023

TL;DR

This paper introduces a parallel processing approach for Newton methods in convolutional neural networks, utilizing complete data for Hessian computation to improve training efficiency over previous sub-sampled methods.

Contribution

The paper proposes a novel parallel processing technique for Newton methods in CNNs that uses full data Hessian calculations, enhancing training speed.

Findings

01. Parallel processing reduces training time

02. Using full data Hessian improves accuracy

03. Outperforms previous sub-sampled approaches

Abstract

Training convolutional neural networks is a high-dimensional, non-convex optimization problem. At present, it is inefficient in situations where parametric learning rates cannot be confidently set. Some past works have introduced Newton methods for training deep neural networks. Newton methods for convolutional neural networks involve complicated operations. Finding the Hessian matrix in second-order methods becomes very complex, as we mainly use the finite differences method with the image data. Newton methods for convolutional neural networks deal with this by using sub-sampled Hessian Newton methods. In this paper, we have used the complete data instead of the sub-sampled methods that only handle partial data at a time. Further, we have used parallel processing instead of serial processing in mini-batch computations. The results obtained using parallel processing in this…
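The abstract's two ideas, a Hessian built from the complete data and mini-batch work done in parallel, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' implementation: a toy logistic-regression loss stands in for a CNN, the Hessian is never formed but only applied to a vector through a finite difference of gradients, the Newton system is solved by conjugate gradient, and a thread pool stands in for whatever parallel hardware the paper uses. The names `batch_grad`, `full_grad`, `hess_vec`, and `newton_step` are hypothetical.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def batch_grad(w, batch):
    """Sum of logistic-loss gradients over one mini-batch (toy model, not a CNN)."""
    g = [0.0] * len(w)
    for x, y in batch:
        z = sum(wi * xi for wi, xi in zip(w, x))
        p = 1.0 / (1.0 + math.exp(-z))
        for j, xj in enumerate(x):
            g[j] += (p - y) * xj
    return g

def full_grad(w, batches, pool, reg=0.1):
    """Complete-data gradient: each mini-batch is handled by a worker, then summed."""
    parts = list(pool.map(lambda b: batch_grad(w, b), batches))
    n = sum(len(b) for b in batches)
    return [sum(p[j] for p in parts) / n + reg * w[j] for j in range(len(w))]

def hess_vec(w, v, g, batches, pool, reg=0.1, eps=1e-6):
    """Finite-difference Hessian-vector product using every mini-batch."""
    norm = math.sqrt(sum(vi * vi for vi in v))
    if norm == 0.0:
        return [0.0] * len(w)
    # Perturb along the unit direction so the step size does not depend on |v|.
    w_eps = [wi + eps * vi / norm for wi, vi in zip(w, v)]
    g_eps = full_grad(w_eps, batches, pool, reg)
    return [norm * (a - b) / eps for a, b in zip(g_eps, g)]

def newton_step(w, batches, pool, reg=0.1, cg_iters=20, tol=1e-12):
    """Solve H d = -g approximately by conjugate gradient and return w + d."""
    g = full_grad(w, batches, pool, reg)
    d = [0.0] * len(w)
    r = [-gi for gi in g]
    p = r[:]
    rs = rs0 = sum(ri * ri for ri in r)
    for _ in range(cg_iters):
        if rs <= tol * rs0:
            break
        hp = hess_vec(w, p, g, batches, pool, reg)
        curv = sum(pi * hi for pi, hi in zip(p, hp))
        if curv <= 0.0:  # stop on non-positive curvature instead of dividing
            break
        alpha = rs / curv
        d = [di + alpha * pi for di, pi in zip(d, p)]
        r = [ri - alpha * hi for ri, hi in zip(r, hp)]
        rs_new = sum(ri * ri for ri in r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return [wi + di for wi, di in zip(w, d)]

# Made-up data: each sample is ([bias, feature], label).
data = [([1.0, 0.5], 1), ([1.0, -0.3], 0), ([1.0, 1.2], 1), ([1.0, -1.0], 0),
        ([1.0, 0.1], 1), ([1.0, -0.2], 0), ([1.0, 0.4], 0), ([1.0, -0.4], 1)]
batches = [data[i:i + 2] for i in range(0, len(data), 2)]

with ThreadPoolExecutor(max_workers=4) as pool:
    w = [0.0, 0.0]
    initial_grad = full_grad(w, batches, pool)
    for _ in range(5):
        w = newton_step(w, batches, pool)
    final_grad = full_grad(w, batches, pool)
```

Because the per-batch gradients are simply summed, the result does not depend on which worker handles which mini-batch; that independence is what makes the mini-batch loop parallelizable. A sub-sampled variant would pass only a subset of `batches` to `hess_vec`, whereas this sketch passes all of them. Python threads are used here only to show the structure and would not by themselves speed up this pure-Python arithmetic.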
