A faster and simpler algorithm for learning shallow networks

Sitan Chen; Shyam Narayanan

arXiv:2307.12496·cs.LG·July 25, 2023


TL;DR

This paper gives a simpler, faster algorithm for learning a linear combination of k ReLU activations from labeled examples drawn from the standard d-dimensional Gaussian. It replaces the multi-stage algorithm of Chen et al. [CDG+23], which runs in time (d/ε)^{quasipoly(k)}, with a one-stage version that runs in time (d/ε)^{O(k^2)}.

Contribution

The authors show that a much simpler one-stage version of the algorithm of Chen et al. [CDG+23] suffices for the same learning problem, and that it reduces the runtime from (d/ε)^{quasipoly(k)} to (d/ε)^{O(k^2)}.

Findings

01

The new algorithm runs in time (d/ε)^{O(k^2)}.

02

It learns in a single stage, whereas the earlier algorithm of [CDG+23] learns over multiple stages.

03

For k = O(1), the runtime is polynomial in the dimension d and in 1/ε, where ε is the target error.

Abstract

We revisit the well-studied problem of learning a linear combination of $k$ ReLU activations given labeled examples drawn from the standard $d$-dimensional Gaussian measure. Chen et al. [CDG+23] recently gave the first algorithm for this problem to run in $\mathrm{poly}(d, 1/\epsilon)$ time when $k = O(1)$, where $\epsilon$ is the target error. More precisely, their algorithm runs in time $(d/\epsilon)^{\mathrm{quasipoly}(k)}$ and learns over multiple stages. Here we show that a much simpler one-stage version of their algorithm suffices, and moreover its runtime is only $(d/\epsilon)^{O(k^{2})}$.
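The learning problem in the abstract can be made concrete with a small data generator. The sketch below only illustrates the problem setup, not the paper's algorithm: it builds a target of the form f(x) = sum_i coeffs[i] * ReLU(<weights[i], x>), draws labeled examples with x from the standard d-dimensional Gaussian, and scores a hypothesis against them. The function names, the unit-norm weights, the ±1 coefficients, and the use of mean squared error as the error measure are all assumptions made for illustration; the abstract does not specify them.

```python
import math
import random


def relu(z):
    """ReLU activation: max(z, 0)."""
    return max(z, 0.0)


def evaluate(coeffs, weights, x):
    """Compute f(x) = sum_i coeffs[i] * relu(<weights[i], x>)."""
    return sum(
        c * relu(sum(wj * xj for wj, xj in zip(w, x)))
        for c, w in zip(coeffs, weights)
    )


def sample_network(d, k, rng):
    """Draw hypothetical target parameters (normalization is an assumption)."""
    weights = []
    for _ in range(k):
        w = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(v * v for v in w))
        weights.append([v / norm for v in w])  # unit-norm weight vector
    coeffs = [rng.choice([-1.0, 1.0]) for _ in range(k)]
    return coeffs, weights


def draw_examples(coeffs, weights, n, rng):
    """Draw n labeled examples (x, f(x)) with x ~ N(0, I_d)."""
    d = len(weights[0])
    examples = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in range(d)]
        examples.append((x, evaluate(coeffs, weights, x)))
    return examples


def empirical_squared_error(hypothesis, examples):
    """Mean squared error of a hypothesis on the examples (assumed error measure)."""
    return sum((hypothesis(x) - y) ** 2 for x, y in examples) / len(examples)
```

A learner in this setting receives only the output of `draw_examples` and must return a hypothesis whose error is at most ε; the runtime bounds quoted in the abstract describe how the cost of doing so scales with d, 1/ε, and k.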


Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Advanced Graph Neural Networks · Machine Learning in Materials Science