On the Modeling of Error Functions as High Dimensional Landscapes for Weight Initialization in Learning Networks

Julius; Gopinath Mahale; Sumana T.; C. S. Adityakrishna

arXiv:1607.06011·cs.LG·July 21, 2016

TL;DR

This paper models the error function of deep neural networks as a high-dimensional landscape using Random Matrix Theory, aiming to improve weight initialization for better learning performance.

Contribution

It introduces a novel approach to weight initialization by analyzing the error landscape with high-dimensional modeling and theoretical insights from Random Matrix Theory.

Findings

01. Error functions can be modeled as high-dimensional landscapes.

02. Theoretical analysis provides insights into the error landscape structure.

03. Improved initial weight guesses enhance learning efficiency.

Abstract

Next generation deep neural networks for classification hosted on embedded platforms will rely on fast, efficient, and accurate learning algorithms. Initialization of weights in learning networks has a great impact on the classification accuracy. In this paper we focus on deriving good initial weights by modeling the error function of a deep neural network as a high-dimensional landscape. We observe that due to the inherent complexity in its algebraic structure, such an error function may conform to general results of the statistics of large systems. To this end we apply some results from Random Matrix Theory to analyse these functions. We model the error function in terms of a Hamiltonian in N-dimensions and derive some theoretical results about its general behavior. These results are further used to make better initial guesses of weights for the learning algorithm.
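
The abstract does not say which Random Matrix Theory results the paper uses, so the following is only an illustrative sketch and not the authors' method. It assumes, purely for illustration, that the Hessian of an N-dimensional error landscape at a critical point behaves like a large symmetric Gaussian random matrix, and it counts the fraction of negative eigenvalues (descent directions). The names `sample_goe` and `negative_fraction`, the choice of ensemble, the size `n = 500`, and the diagonal shift of `1.0` are all hypothetical.

```python
import numpy as np


def sample_goe(n, rng):
    """Draw an n x n symmetric Gaussian matrix, scaled so that its
    eigenvalues stay roughly within [-2, 2] as n grows."""
    a = rng.standard_normal((n, n))
    # Symmetrizing gives off-diagonal variance 1/n after the scaling below.
    return (a + a.T) / np.sqrt(2.0 * n)


def negative_fraction(matrix):
    """Fraction of eigenvalues below zero, i.e. descent directions
    if the matrix is read as a Hessian at a critical point."""
    eigvals = np.linalg.eigvalsh(matrix)
    return float(np.mean(eigvals < 0.0))


rng = np.random.default_rng(0)
n = 500  # assumed dimension, chosen only to keep the example fast

# Hypothetical stand-in for a Hessian; not derived from any real network.
hessian = sample_goe(n, rng)
eigvals = np.linalg.eigvalsh(hessian)

frac_neg = negative_fraction(hessian)
# A positive diagonal shift moves the whole spectrum upward, so fewer
# directions remain descent directions.
frac_neg_shifted = negative_fraction(hessian + 1.0 * np.eye(n))

print(f"spectrum range: [{eigvals.min():.3f}, {eigvals.max():.3f}]")
print(f"negative fraction, unshifted: {frac_neg:.3f}")
print(f"negative fraction, shifted:   {frac_neg_shifted:.3f}")
```

Under this assumed model, about half of the directions at an unshifted critical point are descent directions, and the share drops as the spectrum is shifted upward. Statistics of this kind are the sort of general behavior a random-matrix view of a landscape can describe; how the paper turns such results into initial weights is not stated in the abstract.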

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Statistical Mechanics and Entropy · Markov Chains and Monte Carlo Methods