A Tutorial on Neural Networks and Gradient-free Training

Turibius Rozario; Arjun Trivedi; Ankit Goel

arXiv:2211.17217 · eess.SY · December 1, 2022

Open Access

TL;DR

This paper provides a matrix-based, tutorial-style overview of neural networks, explaining their mathematical structure and comparing gradient-based and gradient-free training methods.

Contribution

It introduces a compact matrix representation of neural networks and analyzes both gradient-based and gradient-free training approaches.

Findings

01 Gradient-free methods are compared with gradient-based training.

02 Neural networks are represented as compositions of linear and nonlinear functions.

03 The paper discusses convergence and accuracy of different training methods.
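
To make the idea of training without a gradient concrete, the sketch below fits a tiny one-hidden-layer network by random local search: perturb the parameters, and keep the perturbation only if the cost drops. This is an illustration chosen for brevity and is not necessarily the gradient-free method the paper studies. The names `forward`, `cost`, and `random_search`, the tanh activation, the step size `sigma`, and the sine-fitting data are all assumptions made here.

```python
import numpy as np

def forward(params, X):
    """One hidden layer: linear map, tanh, linear map (one sample per column of X)."""
    W1, b1, W2, b2 = params
    return W2 @ np.tanh(W1 @ X + b1) + b2

def cost(params, X, Y):
    """Mean squared error between network output and targets."""
    return float(np.mean((forward(params, X) - Y) ** 2))

def random_search(params, X, Y, steps=2000, sigma=0.1, seed=0):
    """Gradient-free training: accept a random perturbation only if it lowers the cost."""
    rng = np.random.default_rng(seed)
    best = cost(params, X, Y)
    for _ in range(steps):
        trial = [p + sigma * rng.standard_normal(p.shape) for p in params]
        trial_cost = cost(trial, X, Y)
        if trial_cost < best:
            params, best = trial, trial_cost
    return params, best

# Hypothetical toy problem: fit sin(pi * x) on 20 points in [-1, 1].
X = np.linspace(-1.0, 1.0, 20).reshape(1, -1)
Y = np.sin(np.pi * X)
init_rng = np.random.default_rng(1)
init = [init_rng.standard_normal((5, 1)), np.zeros((5, 1)),
        init_rng.standard_normal((1, 5)), np.zeros((1, 1))]

initial_cost = cost(init, X, Y)
trained, final_cost = random_search(init, X, Y)
print(initial_cost, final_cost)
```

Because a trial is accepted only when it improves the cost, the cost never increases, but nothing here guarantees how fast it decreases or where it ends up; that trade-off is the kind of question a comparison with gradient-based training would address.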

Abstract

This paper presents a compact, matrix-based representation of neural networks in a self-contained tutorial fashion. Specifically, we develop neural networks as a composition of several vector-valued functions. Although neural networks are well-understood pictorially in terms of interconnected neurons, neural networks are mathematical nonlinear functions constructed by composing several vector-valued functions. Using basic results from linear algebra, we represent a neural network as an alternating sequence of linear maps and scalar nonlinear functions, also known as activation functions. The training of neural networks requires the minimization of a cost function, which in turn requires the computation of a gradient. Using basic multivariable calculus results, the cost gradient is also shown to be a function composed of a sequence of linear maps and nonlinear functions. In addition to…
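
The two structural claims in the abstract can be sketched in a few lines of NumPy: the network is an alternating sequence of linear maps and a scalar activation applied elementwise, and the cost gradient is itself a sequence of linear maps (the transposed weights) and elementwise nonlinear terms (the activation derivative). This is a minimal sketch under assumptions that are not taken from the paper: biases are omitted, the activation is tanh, the last layer is linear, the cost is half the squared error, and the names `forward`, `cost`, and `cost_gradient` are hypothetical.

```python
import numpy as np

def forward(weights, x):
    """Alternate linear maps and tanh; return the input and every layer output."""
    activations = [x]
    for i, W in enumerate(weights):
        z = W @ activations[-1]                 # linear map
        is_last = (i == len(weights) - 1)
        activations.append(z if is_last else np.tanh(z))  # scalar nonlinearity
    return activations

def cost(weights, x, y):
    """Half the squared error between the network output and the target y."""
    return 0.5 * float(np.sum((forward(weights, x)[-1] - y) ** 2))

def cost_gradient(weights, x, y):
    """Gradient of the cost with respect to each weight matrix, via the chain rule."""
    acts = forward(weights, x)
    delta = acts[-1] - y                        # derivative of cost w.r.t. the last linear output
    grads = [None] * len(weights)
    for i in reversed(range(len(weights))):
        grads[i] = np.outer(delta, acts[i])     # derivative w.r.t. weights[i]
        if i > 0:
            # Back through the linear map (transpose), then through tanh,
            # whose derivative is 1 - tanh(z)^2 = 1 - acts[i]^2.
            delta = (weights[i].T @ delta) * (1.0 - acts[i] ** 2)
    return grads

# Hypothetical sizes: 3 inputs, 4 hidden units, 2 outputs.
rng = np.random.default_rng(0)
weights = [rng.standard_normal((4, 3)), rng.standard_normal((2, 4))]
x = rng.standard_normal(3)
y = rng.standard_normal(2)
grads = cost_gradient(weights, x, y)
print([g.shape for g in grads])
```

Each gradient has the same shape as the weight matrix it belongs to, so a gradient-based update can subtract a scaled copy of `grads[i]` from `weights[i]` directly.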

Taxonomy

Topics: Neural Networks and Applications · Model Reduction and Neural Networks