Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials

Ilias Diakonikolas; Daniel M. Kane

arXiv:2307.12840·cs.LG·July 26, 2023

Open Access

TL;DR

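This paper presents an efficient algorithm for PAC learning a linear combination of $k$ ReLU activations (a one-hidden-layer ReLU network) with Gaussian inputs under the square loss. The algorithm uses tensor decomposition, its analysis relies on Schur polynomials, and its complexity is near-optimal within the class of Correlational Statistical Query algorithms.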

Contribution

The paper introduces an algorithm for learning one-hidden-layer ReLU networks with sample and computational complexity $(dk/ϵ)^{O(k)}$, built on tensor decomposition and analyzed with the theory of Schur polynomials.

Findings

01

Algorithm achieves sample and computational complexity $(dk/ϵ)^{O(k)}$, improving over the prior bound of $(dk/ϵ)^{h(k)}$, where $h(k)$ scales super-polynomially in $k$.

02

Uses tensor decomposition to identify a subspace such that all $O(k)$-order moments are small in the orthogonal directions.

03

The analysis uses the theory of Schur polynomials to show that higher-moment error tensors are small when lower-order moments are controlled.

Abstract

We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^{d}$ with respect to the square loss. Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/ϵ)^{O(k)}$, where $ϵ > 0$ is the target accuracy. Prior work had given an algorithm for this problem with complexity $(dk/ϵ)^{h(k)}$, where the function $h(k)$ scales super-polynomially in $k$. Interestingly, the complexity of our algorithm is near-optimal within the class of Correlational Statistical Query algorithms. At a high level, our algorithm uses tensor decomposition to identify a subspace such that all the $O(k)$-order moments are small in the orthogonal directions. Its analysis makes essential use of the theory of Schur polynomials to show that the higher-moment error tensors…
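To make the setting in the abstract concrete, here is a minimal sketch of the learning problem and of the simplest moment-based subspace estimate. It is not the paper's algorithm: it only uses the second-order moment matrix $E[y (x x^{T} - I)]$, whereas the abstract describes working with moments of order up to $O(k)$. All function names, the weight matrix `W`, the coefficients `a`, and the sample size are assumptions chosen for illustration.

```python
import numpy as np

def sample_relu_network(n, W, a, rng):
    """Draw x ~ N(0, I_d) and return labels y = sum_i a_i * ReLU(<w_i, x>)."""
    d = W.shape[1]
    X = rng.standard_normal((n, d))
    y = np.maximum(X @ W.T, 0.0) @ a
    return X, y

def square_loss(y_pred, y_true):
    """Empirical square loss, the error measure named in the abstract."""
    return float(np.mean((y_pred - y_true) ** 2))

def second_moment_matrix(X, y):
    """Empirical estimate of E[y (x x^T - I)]."""
    n, d = X.shape
    return (X.T * y) @ X / n - np.mean(y) * np.eye(d)

def top_subspace(M, k):
    """Orthonormal basis for the k eigen-directions of largest |eigenvalue|."""
    eigvals, eigvecs = np.linalg.eigh(M)
    order = np.argsort(-np.abs(eigvals))[:k]
    return eigvecs[:, order]

# Hypothetical instance: k = 2 orthonormal hidden weights in d = 6 dimensions.
rng = np.random.default_rng(0)
d, k = 6, 2
W = np.eye(d)[:k]           # rows are the hidden-unit directions (assumed)
a = np.array([1.0, 0.5])    # output-layer coefficients (assumed, both positive)

X, y = sample_relu_network(200_000, W, a, rng)
M2 = second_moment_matrix(X, y)
V = top_subspace(M2, k)

# Norm of each w_i after projection onto the estimated subspace.
captured = np.linalg.norm(W @ V, axis=1)
```

For unit-norm $w_i$, the population version of this matrix equals $\frac{1}{\sqrt{2\pi}} \sum_i a_i w_i w_i^{T}$, so its range lies in the span of the hidden weights. With coefficients of mixed sign the terms can cancel at this order, which is one reason to look at higher-order moment tensors rather than the second-order matrix alone.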

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Statistical Methods and Bayesian Inference · Random Matrices and Applications