A faster and simpler algorithm for learning shallow networks
Sitan Chen, Shyam Narayanan

TL;DR
This paper gives a simpler, faster algorithm for learning a linear combination of k ReLU activations from Gaussian-distributed examples. A single-stage method replaces the earlier multi-stage approach of Chen et al. [CDG+23], and the runtime improves to (d/ε)^{O(k^2)}.
Contribution
The authors show that a much simpler one-stage version of the multi-stage algorithm of Chen et al. [CDG+23] suffices for this learning problem, and that it has a substantially smaller runtime.
Findings
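The new algorithm runs in time (d/ε)^{O(k^2)}, where d is the input dimension, k is the number of ReLU activations, and ε is the target error.
It learns in a single stage, replacing the multi-stage procedure of the earlier algorithm.
For constant k, the runtime is polynomial in d and 1/ε.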
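To see why a bound of this form is polynomial when k is bounded, it can help to write the exponent with an explicit constant. The constant C and the bound k_0 below are illustrative symbols, not values taken from the paper:

```latex
\left(\frac{d}{\varepsilon}\right)^{C k^{2}}
\;\le\;
\left(\frac{d}{\varepsilon}\right)^{C k_{0}^{2}}
\qquad \text{for all } k \le k_{0} \text{ and } d/\varepsilon \ge 1 .
```

With k_0 fixed, the right-hand side is a fixed power of d/ε, hence polynomial in d and 1/ε.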
Abstract
We revisit the well-studied problem of learning a linear combination of k ReLU activations given labeled examples drawn from the standard d-dimensional Gaussian measure. Chen et al. [CDG+23] recently gave the first algorithm for this problem to run in poly(d, 1/ε) time when k = O(1), where ε is the target error. More precisely, their algorithm runs in time (d/ε)^{quasipoly(k)} and learns over multiple stages. Here we show that a much simpler one-stage version of their algorithm suffices, and moreover its runtime is only (d/ε)^{O(k^2)}.
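The learning problem in the abstract can be made concrete with a small simulation. The sketch below only generates data from the model (Gaussian inputs, labels given by a linear combination of k ReLUs) and measures squared error; it is not the authors' algorithm. The function names, the unit-norm weight vectors, the ±1 output coefficients, and the use of mean squared error as the error measure are all assumptions made for illustration.

```python
import numpy as np


def relu(z):
    """Elementwise ReLU activation."""
    return np.maximum(z, 0.0)


def sample_relu_network(d, k, n, seed=0):
    """Draw n labeled examples from a random target with k ReLU units.

    Assumptions (not from the paper): weight vectors are normalized to
    unit length and output coefficients are random signs.
    """
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((k, d))
    W /= np.linalg.norm(W, axis=1, keepdims=True)  # unit-norm rows
    lam = rng.choice([-1.0, 1.0], size=k)          # output coefficients
    X = rng.standard_normal((n, d))                # x ~ N(0, I_d)
    y = relu(X @ W.T) @ lam                        # y = sum_i lam_i * ReLU(<w_i, x>)
    return X, y, W, lam


def squared_error(predict, X, y):
    """Empirical mean squared error of a predictor on (X, y)."""
    return float(np.mean((predict(X) - y) ** 2))


X, y, W, lam = sample_relu_network(d=20, k=3, n=5000)
err_true = squared_error(lambda Z: relu(Z @ W.T) @ lam, X, y)
err_zero = squared_error(lambda Z: np.zeros(len(Z)), X, y)
print(err_true, err_zero)
```

A learner for this problem sees only `X` and `y` and must output a predictor whose error is at most ε. The true network has zero error on its own labels, while the all-zeros predictor gives a simple baseline to compare against.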
Taxonomy
Topics: Stochastic Gradient Optimization Techniques · Advanced Graph Neural Networks · Machine Learning in Materials Science
