Avoiding local minima in multilayer network optimization by incremental training

Alberto De Santis; Giampaolo Liuzzi; Stefano Lucidi; Edoardo Maria Tronci

arXiv:2106.06477 · math.OC · June 14, 2021

TL;DR

This paper mathematically characterizes a class of undesirable stationary points that arise when training deep neural networks, and uses that characterization to define an incremental training method that avoids their regions of attraction.

Contribution

The paper extends results known for shallow networks to give a mathematical characterization of a class of stationary points in deep networks, and proposes an incremental training algorithm that avoids getting stuck near them.

Findings

01. The characterization of stationary points in deep networks.

02. An incremental training algorithm that avoids undesirable stationary points.

03. Improved training efficiency for large neural networks.

Abstract

Training a large multilayer neural network can present many difficulties due to the large number of useless stationary points. These points usually attract the minimization algorithm used during the training phase, which therefore becomes inefficient. Extending some results proposed in the literature for shallow networks, we propose a mathematical characterization of a class of such stationary points that arise in deep neural network training. Using this description, we are able to define an incremental training algorithm that avoids getting stuck in the region of attraction of these undesirable stationary points.
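To make the notion of a "useless stationary point" concrete, here is a minimal sketch of one textbook case. It is not taken from the paper and is not necessarily in the class the authors characterize. It assumes a one-hidden-layer tanh network without biases, a mean squared error loss, and a toy regression target; the names `loss_and_grads`, `W1`, `W2`, and the data are all hypothetical. With every weight set to zero, the gradient vanishes exactly although the loss is far from its minimum, so plain gradient descent started there never moves.

```python
import numpy as np

def loss_and_grads(W1, W2, X, y):
    """MSE loss of a one-hidden-layer tanh network (no biases) and its gradients."""
    n = X.shape[0]
    H = np.tanh(X @ W1)                 # hidden activations, shape (n, hidden)
    err = H @ W2 - y                    # residuals, shape (n, 1)
    loss = 0.5 * np.mean(err ** 2)
    gW2 = H.T @ err / n                 # gradient w.r.t. output weights
    gW1 = X.T @ ((err @ W2.T) * (1.0 - H ** 2)) / n  # gradient w.r.t. input weights
    return loss, gW1, gW2

# Toy data (an assumption for illustration only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = np.sin(X[:, :1])

# All-zero weights: hidden units output tanh(0) = 0, so both gradients are zero.
W1 = np.zeros((3, 4))
W2 = np.zeros((4, 1))
loss, gW1, gW2 = loss_and_grads(W1, W2, X, y)
grad_norm = float(np.sqrt(np.sum(gW1 ** 2) + np.sum(gW2 ** 2)))

# Gradient descent started at this point makes no progress.
for _ in range(100):
    _, g1, g2 = loss_and_grads(W1, W2, X, y)
    W1 -= 0.1 * g1
    W2 -= 0.1 * g2
stuck_loss = loss_and_grads(W1, W2, X, y)[0]

print(f"loss={loss:.4f}  grad_norm={grad_norm}  loss after 100 steps={stuck_loss:.4f}")
```

In practice, random initialization avoids this exact point, but a minimizer can still be drawn toward stationary points of a similar flavor, which is the difficulty the abstract describes.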

Taxonomy

Topics: Neural Networks and Applications · Model Reduction and Neural Networks · Machine Learning and ELM