Nonparametric Learning of Two-Layer ReLU Residual Units
Zhunxuan Wang, Linyun He, Chunchuan Lyu, Shay B. Cohen

TL;DR
This paper introduces a nonparametric, convex-optimization-based algorithm for learning two-layer residual units with ReLU activations. The algorithm comes with a statistical consistency guarantee and is evaluated on synthetic and benchmark datasets.
Contribution
It develops a layer-wise convex optimization framework for learning residual units, providing statistical consistency and efficient solution methods.
Findings
The algorithm is statistically consistent.
Solutions can be obtained efficiently via linear programming.
The method is effective on synthetic and benchmark datasets.
Abstract
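We describe an algorithm that learns two-layer residual units using rectified linear unit (ReLU) activation: suppose the input $\mathbf{x}$ is from a distribution with support space $\mathbb{R}^d$ and the ground-truth generative model is a residual unit of this type, given by $\mathbf{y} = \boldsymbol{B}^\ast\left[\left(\boldsymbol{A}^\ast\mathbf{x}\right)^+ + \mathbf{x}\right]$, where ground-truth network parameters $\boldsymbol{A}^\ast \in \mathbb{R}^{d \times d}$ represent a full-rank matrix with nonnegative entries and $\boldsymbol{B}^\ast \in \mathbb{R}^{m \times d}$ is full-rank with $m \geq d$ and for $\boldsymbol{c} \in \mathbb{R}^d$, $[\boldsymbol{c}^{+}]_i = \max\{0, c_i\}$. We design layer-wise objectives as functionals whose analytic minimizers express the exact ground-truth network in terms of its parameters and nonlinearities. Following this objective landscape, learning…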
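A minimal sketch of the forward model the abstract describes, assuming the residual unit takes the form y = B[(Ax)^+ + x] with an elementwise ReLU, A a d-by-d matrix with nonnegative entries, and B an m-by-d matrix. The helper names (`relu`, `matvec`, `residual_unit`) and the randomly drawn parameters are illustrative assumptions, not taken from the paper. The code only generates an output from such a unit; it does not implement the learning algorithm.

```python
import random


def relu(v):
    """Elementwise ReLU: [c^+]_i = max(0, c_i)."""
    return [max(0.0, c) for c in v]


def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def residual_unit(x, A, B):
    """Assumed form of the two-layer residual unit: y = B[(Ax)^+ + x]."""
    # Hidden layer: ReLU of the linear map plus the skip connection.
    hidden = [h + x_i for h, x_i in zip(relu(matvec(A, x)), x)]
    # Output layer: a plain linear map.
    return matvec(B, hidden)


# Hypothetical ground-truth parameters, drawn at random for illustration only.
random.seed(0)
d, m = 3, 4  # m >= d, as assumed for B
A_star = [[random.uniform(0.1, 1.0) for _ in range(d)] for _ in range(d)]  # nonnegative entries
B_star = [[random.gauss(0.0, 1.0) for _ in range(d)] for _ in range(m)]

x = [random.gauss(0.0, 1.0) for _ in range(d)]
y = residual_unit(x, A_star, B_star)
```

Because A has nonnegative entries, the ReLU is inactive whenever every coordinate of x is nonnegative, so on that region the unit reduces to the linear map B(A + I)x.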
Taxonomy
Topics: Machine Learning and Algorithms · Sparse and Compressive Sensing Techniques · Machine Learning and Data Classification
