Finite Sample Identification of Wide Shallow Neural Networks with Biases

Massimo Fornasier; Timo Klock; Marco Mondelli; Michael Rauchensteiner

arXiv:2211.04589 · cs.LG · November 10, 2022

TL;DR

This paper develops a constructive method, with theoretical guarantees, for identifying the parameters of wide shallow neural networks with biases from finite samples, a setting for which little was previously known.

Contribution

It introduces a constructive two-step approach for finite sample identification of wide shallow neural networks with biases, including theoretical analysis and empirical validation.

Findings

01. Effective parameter recovery demonstrated through numerical experiments.

02. Theoretical guarantees established for the proposed identification method.

03. Addresses a previously unresolved case of networks with biases in the finite sample setting.

Abstract

Artificial neural networks are functions depending on a finite number of parameters typically encoded as weights and biases. The identification of the parameters of the network from finite samples of input-output pairs is often referred to as the teacher-student model, and this model has represented a popular framework for understanding training and generalization. Even if the problem is NP-complete in the worst case, a rapidly growing literature, after adding suitable distributional assumptions, has established finite sample identification of two-layer networks with a number of neurons $m = O(D)$, $D$ being the input dimension. For the range $D < m < D^{2}$ the problem becomes harder, and truly little is known for networks parametrized by biases as well. This paper fills the gap by providing constructive methods and theoretical guarantees of finite sample identification for…
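
As a rough illustration of the teacher-student setup the abstract describes, the sketch below draws finite input-output samples from a shallow "teacher" network with biases, with a width chosen in the range $D < m < D^{2}$. It only generates the data an identification method would start from; it does not implement the paper's method. The network form (a sum of tanh units with unit output weights), the Gaussian inputs, the unit-norm weight rows, the bias range, and all function names are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np

def make_teacher(D, m, rng):
    """Draw hypothetical teacher parameters: m unit-norm weight vectors in R^D and m biases."""
    W = rng.standard_normal((m, D))
    W /= np.linalg.norm(W, axis=1, keepdims=True)  # normalize each row (assumption)
    tau = rng.uniform(-1.0, 1.0, size=m)           # biases in [-1, 1) (assumption)
    return W, tau

def shallow_net(X, W, tau):
    """Evaluate f(x) = sum_j tanh(<w_j, x> + tau_j) for every row x of X."""
    return np.tanh(X @ W.T + tau).sum(axis=1)

def sample_pairs(W, tau, n, rng):
    """Draw n Gaussian inputs and the corresponding teacher outputs."""
    X = rng.standard_normal((n, W.shape[1]))
    return X, shallow_net(X, W, tau)

rng = np.random.default_rng(0)
D, m, n = 10, 30, 200  # width chosen so that D < m < D**2
W, tau = make_teacher(D, m, rng)
X, y = sample_pairs(W, tau, n, rng)
```

In this setting, the identification problem is to recover `W` and `tau` (up to the symmetries of the network, such as permuting neurons) from the pairs `(X, y)` alone.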

Taxonomy

Topics: Machine Learning and Algorithms · Stochastic Gradient Optimization Techniques · Model Reduction and Neural Networks