On the Connection Between Learning Two-Layers Neural Networks and Tensor Decomposition

Marco Mondelli; Andrea Montanari

arXiv:1802.07301·cs.LG·October 11, 2018·23 cites

TL;DR

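This paper ties the difficulty of learning two-layer neural networks to that of tensor decomposition. For low-degree polynomial activations, it shows under a complexity-theoretic assumption that no polynomial-time algorithm can beat the trivial predictor when the number of hidden units $r$ satisfies $d^{3/2} \ll r \ll d^{2}$, and it highlights tensor methods as fundamental to the problem.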

Contribution

It establishes a complexity-theoretic connection between neural network learning and tensor decomposition, providing hardness results for polynomial activations under standard assumptions.

Findings

01

Polynomial-time algorithms cannot outperform the trivial predictor $E(y)$ when $d^{3/2} \ll r \ll d^{2}$

02

Tensor decomposition methods are central to learning two-layer networks

03

Hardness results extend to higher degree activations and non-random weights

Abstract

We establish connections between the problem of learning a two-layer neural network and tensor decomposition. We consider a model with feature vectors $x \in \mathbb{R}^{d}$, $r$ hidden units with weights $\{w_{i}\}_{1 \leq i \leq r}$ and output $y \in \mathbb{R}$, i.e., $y = \sum_{i = 1}^{r} \sigma(w_{i}^{T} x)$, with activation functions given by low-degree polynomials. In particular, if $\sigma(x) = a_{0} + a_{1} x + a_{3} x^{3}$, we prove that no polynomial-time learning algorithm can outperform the trivial predictor that assigns to each example the response variable $\mathbb{E}(y)$, when $d^{3/2} \ll r \ll d^{2}$. Our conclusion holds for a "natural data distribution", namely standard Gaussian feature vectors $x$, and output distributed according to a two-layer neural network with random isotropic weights, and under a certain complexity-theoretic…
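To make the model in the abstract concrete, the following is a minimal sketch, not the authors' code. It draws Gaussian features, evaluates $y = \sum_{i} \sigma(w_{i}^{T} x)$ with a cubic activation, and checks numerically that the cubic part of the output equals the contraction of the third-order tensor $\sum_{i} w_{i} \otimes w_{i} \otimes w_{i}$ with $x \otimes x \otimes x$, which is one way to see why tensor decomposition enters the picture. Several choices here are assumptions made for illustration only: the coefficients `a0`, `a1`, `a3`, the tiny sizes `d`, `r`, `n`, all function names, and the use of unit-sphere weights as one reading of "random isotropic weights".

```python
import math
import random


def sigma(z, a0=0.0, a1=1.0, a3=1.0):
    """Cubic activation a0 + a1*z + a3*z**3 (coefficients are illustrative)."""
    return a0 + a1 * z + a3 * z ** 3


def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def sample_unit_vector(d, rng):
    """Assumed weight model: a Gaussian vector normalized to the unit sphere."""
    v = [rng.gauss(0.0, 1.0) for _ in range(d)]
    norm = math.sqrt(dot(v, v))
    return [c / norm for c in v]


def network_output(W, x, a0=0.0, a1=1.0, a3=1.0):
    """Direct evaluation of y = sum_i sigma(w_i^T x)."""
    return sum(sigma(dot(w, x), a0, a1, a3) for w in W)


def third_order_tensor(W):
    """T[a][b][c] = sum_i w_i[a] * w_i[b] * w_i[c]."""
    d = len(W[0])
    return [[[sum(w[a] * w[b] * w[c] for w in W) for c in range(d)]
             for b in range(d)] for a in range(d)]


def output_via_tensor(W, T, x, a0=0.0, a1=1.0, a3=1.0):
    """Same output, with the cubic term written as the contraction <T, x(x)x(x)x>."""
    d = len(x)
    linear = sum(dot(w, x) for w in W)
    cubic = sum(T[a][b][c] * x[a] * x[b] * x[c]
                for a in range(d) for b in range(d) for c in range(d))
    return a0 * len(W) + a1 * linear + a3 * cubic


rng = random.Random(0)
d, r, n = 4, 10, 50  # toy sizes with d**1.5 < r < d**2; far from any asymptotic regime
W = [sample_unit_vector(d, rng) for _ in range(r)]
X = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]
T = third_order_tensor(W)

y_direct = [network_output(W, x) for x in X]
y_tensor = [output_via_tensor(W, T, x) for x in X]

# Largest disagreement between the two evaluations (floating-point noise only).
max_gap = max(abs(p - q) for p, q in zip(y_direct, y_tensor))

# Empirical squared error of the trivial predictor that always outputs the mean of y.
mean_y = sum(y_direct) / n
trivial_mse = sum((y - mean_y) ** 2 for y in y_direct) / n
```

The value `trivial_mse` is the baseline that, according to the abstract, no polynomial-time learner can improve on in the stated regime. At these toy sizes the sketch only illustrates the definitions and says nothing about hardness.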

Taxonomy

Topics: Tensor decomposition and applications · Model Reduction and Neural Networks · Machine Learning and ELM