A Unified Framework for Training Neural Networks

Hadi Ghauch; Hossein Shokri-Ghadikolaei; Carlo Fischione; Mikael Skoglund

arXiv:1805.09214 · cs.LG · May 24, 2018 · 1 citation

Open Access

TL;DR

This paper introduces an optimization framework that unifies the convergence analysis of training algorithms for various deep neural network architectures, establishing convergence for arbitrary smooth loss, activation, and regularization functions.

Contribution

It presents a unified convergence analysis for training different types of DNNs, covering arbitrary smooth loss, activation, and regularization functions, and recovering well-known first- and second-order training methods as special cases.

Findings

01 The framework guarantees convergence for multiple DNN architectures (e.g., feed-forward, convolutional, linear networks).

02 It unifies the analysis of well-known first- and second-order training methods.

03 It applies to both regression and classification tasks.

Abstract

The lack of mathematical tractability of Deep Neural Networks (DNNs) has hindered progress towards a unified convergence analysis of training algorithms in the general setting. We propose a unified optimization framework for training different types of DNNs, and establish its convergence for arbitrary loss, activation, and regularization functions, assumed to be smooth. We show that the framework generalizes well-known first- and second-order training methods, and thus allows us to show the convergence of these methods for various DNN architectures and learning tasks, as special cases of our approach. We discuss some of its applications in training various DNN architectures (e.g., feed-forward, convolutional, linear networks) for regression and classification tasks.
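
As a rough illustration of what "first- and second-order training methods" refers to in the abstract, the sketch below contrasts a gradient-descent update with a Newton update on a smooth, L2-regularized squared loss. It is not the paper's framework. The one-weight linear model, the toy data, and all names (`grad`, `hess`, `gradient_descent`, `newton_step`) are assumptions introduced here for illustration only.

```python
# Toy objective (an assumption, not taken from the paper):
#   f(w) = 0.5 * sum_i (w * x_i - y_i)^2 + 0.5 * lam * w^2
# This is smooth in w, matching the smoothness assumption in the abstract.

def grad(w, xs, ys, lam):
    """First derivative of f with respect to the single weight w."""
    return sum((w * x - y) * x for x, y in zip(xs, ys)) + lam * w


def hess(xs, lam):
    """Second derivative of f; constant in w because f is quadratic."""
    return sum(x * x for x in xs) + lam


def gradient_descent(w, xs, ys, lam, lr, steps):
    """First-order method: repeatedly step against the gradient."""
    for _ in range(steps):
        w -= lr * grad(w, xs, ys, lam)
    return w


def newton_step(w, xs, ys, lam):
    """Second-order method: rescale the gradient by the inverse curvature."""
    return w - grad(w, xs, ys, lam) / hess(xs, lam)


xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
lam = 0.1

# Closed-form minimizer of the quadratic objective, used as a reference.
w_star = sum(x * y for x, y in zip(xs, ys)) / hess(xs, lam)

w_gd = gradient_descent(0.0, xs, ys, lam, lr=0.05, steps=50)
w_newton = newton_step(0.0, xs, ys, lam)

print(f"reference: {w_star:.6f}  gd: {w_gd:.6f}  newton: {w_newton:.6f}")
```

On this quadratic toy problem a single Newton step lands on the minimizer, while gradient descent approaches it over many iterations, provided the step size is below 2 divided by the curvature. Real DNN training objectives are non-convex, so neither behavior carries over directly.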

Taxonomy

Topics: Neural Networks and Applications · Gaussian Processes and Bayesian Inference · Model Reduction and Neural Networks