Convergence Analysis for Training Stochastic Neural Networks via Stochastic Gradient Descent

Richard Archibald; Feng Bao; Yanzhao Cao; Hui Sun

arXiv:2212.08924 · math.NA · December 20, 2022 · 1 citation

TL;DR

This paper proves convergence of a novel sample-wise back-propagation method for stochastic neural networks modeled as SDE discretizations, using stochastic optimal control theory, with validation through numerical experiments.

Contribution

The paper gives a convergence analysis for a sample-wise back-propagation method for training stochastic neural networks (SNNs). The analysis relates the number of training steps to the network depth and relies on techniques from stochastic optimal control.

Findings

01. In the convex optimization case, the number of training steps should be proportional to the square of the number of layers.

02. Convergence is validated through numerical experiments.

03. Performance is demonstrated on benchmark machine learning tasks.

Abstract

In this paper, we carry out numerical analysis to prove convergence of a novel sample-wise back-propagation method for training a class of stochastic neural networks (SNNs). The structure of the SNN is formulated as a discretization of a stochastic differential equation (SDE). A stochastic optimal control framework is introduced to model the training procedure, and a sample-wise approximation scheme for the adjoint backward SDE is applied to improve the efficiency of the stochastic optimal control solver, which is equivalent to back-propagation for training the SNN. The convergence analysis is derived both with and without a convexity assumption on the optimization of the SNN parameters. In particular, our analysis indicates that the number of SNN training steps should be proportional to the square of the number of layers in the convex optimization case. Numerical experiments are carried out to…
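
To make the "SNN as a discretization of an SDE" idea concrete, the following is a minimal sketch of a forward pass built on an Euler-Maruyama step, where each layer advances the state by one time step. It is an illustration under stated assumptions, not the authors' implementation: the `tanh` drift, the constant diffusion coefficient `sigma`, the time horizon `T`, and the function name `snn_forward` are all hypothetical choices that do not appear in the abstract.

```python
import numpy as np


def snn_forward(x0, weights, biases, sigma, T=1.0, rng=None):
    """Forward pass of a toy SNN, one Euler-Maruyama step per layer.

    Assumed (hypothetical) layer update, with step size h = T / N:
        X_{n+1} = X_n + h * tanh(W_n X_n + b_n) + sigma * sqrt(h) * xi_n,
    where xi_n is a standard normal vector.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_layers = len(weights)
    h = T / n_layers  # each layer corresponds to one time step of the SDE
    x = np.asarray(x0, dtype=float)
    for W, b in zip(weights, biases):
        drift = np.tanh(W @ x + b)            # assumed drift term
        noise = rng.standard_normal(x.shape)  # Brownian increment, scaled below
        x = x + h * drift + sigma * np.sqrt(h) * noise
    return x


# Usage: a 10-layer network on a 3-dimensional state.
d, N = 3, 10
param_rng = np.random.default_rng(0)
weights = [0.1 * param_rng.standard_normal((d, d)) for _ in range(N)]
biases = [np.zeros(d) for _ in range(N)]

# Two passes with the same noise seed produce the same sample path.
out_a = snn_forward(np.ones(d), weights, biases, sigma=0.2,
                    rng=np.random.default_rng(1))
out_b = snn_forward(np.ones(d), weights, biases, sigma=0.2,
                    rng=np.random.default_rng(1))
```

In this sketch, adding layers refines the time grid (smaller `h`) rather than lengthening the horizon, which is the sense in which network depth plays the role of the number of discretization steps. Setting `sigma=0.0` removes the noise and leaves a deterministic residual-style network.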

Code & Models

Repositories

huisun317/snn (Official)

Taxonomy

Topics: Machine Learning and ELM · Stochastic Gradient Optimization Techniques · Model Reduction and Neural Networks