Beating the Perils of Non-Convexity: Guaranteed Training of Neural Networks using Tensor Methods

Majid Janzamin; Hanie Sedghi; Anima Anandkumar

arXiv:1506.08473 · cs.LG · January 13, 2016 · 146 cites

TL;DR

This paper introduces a tensor decomposition-based algorithm for guaranteed training of two-layer neural networks, overcoming non-convexity issues and providing risk bounds with polynomial sample complexity.

Contribution

It presents a novel, provably convergent tensor method for neural network training with theoretical guarantees and competitive computational efficiency.

Findings

01. Provably converges to the global optimum under mild conditions

02. Achieves risk bounds with polynomial sample complexity

03. Computationally comparable to stochastic gradient descent

Abstract

Training neural networks is a challenging non-convex optimization problem, and backpropagation or gradient descent can get stuck in spurious local optima. We propose a novel algorithm based on tensor decomposition for guaranteed training of two-layer neural networks. We provide risk bounds for our proposed method, with a polynomial sample complexity in the relevant parameters, such as input dimension and number of neurons. While learning arbitrary target functions is NP-hard, we provide transparent conditions on the function and the input for learnability. Our training method is based on tensor decomposition, which provably converges to the global optimum, under a set of mild non-degeneracy conditions. It consists of simple embarrassingly parallel linear and multi-linear operations, and is competitive with standard stochastic gradient descent (SGD), in terms of computational complexity.…
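As a rough illustration of the kind of "simple linear and multi-linear operations" the abstract refers to, the sketch below runs a generic tensor power iteration with deflation on a small symmetric third-order tensor. It is not the paper's training algorithm: it assumes the tensor is already given and is exactly orthogonally decomposable with positive weights, and it leaves out how such a tensor would be estimated from data. All function names (`build_tensor`, `contract`, `power_iteration`, `decompose`) and the toy components are hypothetical, chosen only for this example.

```python
import math
import random


def build_tensor(weights, components):
    """Return T = sum_j w_j * a_j (x) a_j (x) a_j as a nested d x d x d list."""
    d = len(components[0])
    T = [[[0.0] * d for _ in range(d)] for _ in range(d)]
    for w, a in zip(weights, components):
        for i in range(d):
            for j in range(d):
                for k in range(d):
                    T[i][j][k] += w * a[i] * a[j] * a[k]
    return T


def contract(T, u):
    """Multilinear map T(I, u, u): contract the last two modes with u."""
    d = len(u)
    return [
        sum(T[i][j][k] * u[j] * u[k] for j in range(d) for k in range(d))
        for i in range(d)
    ]


def normalize(v):
    """Scale v to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def power_iteration(T, iters=50, seed=0):
    """Find one (eigenvalue, eigenvector) pair from a random unit start."""
    rng = random.Random(seed)
    u = normalize([rng.gauss(0.0, 1.0) for _ in range(len(T))])
    for _ in range(iters):
        u = normalize(contract(T, u))
    # Eigenvalue estimate T(u, u, u).
    lam = sum(ui * vi for ui, vi in zip(u, contract(T, u)))
    return lam, u


def decompose(T, rank, iters=50):
    """Recover `rank` components one at a time, deflating after each."""
    d = len(T)
    residual = [[row[:] for row in matrix] for matrix in T]  # work on a copy
    found = []
    for r in range(rank):
        lam, u = power_iteration(residual, iters, seed=r)
        found.append((lam, u))
        # Deflation: subtract the recovered rank-one term.
        for i in range(d):
            for j in range(d):
                for k in range(d):
                    residual[i][j][k] -= lam * u[i] * u[j] * u[k]
    return found


# Toy input (assumed for illustration): three orthonormal components in R^3.
s = 1.0 / math.sqrt(2.0)
toy_components = [[s, s, 0.0], [s, -s, 0.0], [0.0, 0.0, 1.0]]
toy_weights = [3.0, 2.0, 1.0]
toy_tensor = build_tensor(toy_weights, toy_components)
recovered = decompose(toy_tensor, rank=3)
```

Each entry of `recovered` pairs a weight with a unit vector; the order in which components are found depends on the random starting point, so compare them to the inputs as a set rather than position by position. The independent restarts and contractions are what make this style of computation easy to parallelize.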

Taxonomy

Topics: Tensor decomposition and applications · Machine Learning and Algorithms · Stochastic Gradient Optimization Techniques