Risk Bounds for High-dimensional Ridge Function Combinations Including Neural Networks

Jason M. Klusowski; Andrew R. Barron

arXiv:1607.01434 · math.ST · October 31, 2018 · 37 cites

Open Access

TL;DR

This paper derives risk bounds for high-dimensional ridge function combinations, including neural networks, showing small estimation error even when input dimension exceeds sample size, by analyzing penalized estimators with various smooth ridge functions.

Contribution

The paper provides new risk bounds for ridge function estimators, including neural networks, that hold even when the input dimension is very large.

Findings

01

Risk bounds depend on the spectral norm of the target function, the sample size, and the input dimension only logarithmically.

02

Estimates remain accurate even when input dimension exceeds sample size.

03

Bounds improve over traditional rates when dimension is large.

Abstract

Let $f^{\star}$ be a function on $\mathbb{R}^{d}$ with an assumption of a spectral norm $v_{f^{\star}}$. For various noise settings, we show that $\mathbb{E} \| \hat{f} - f^{\star} \|^{2} \leq \left( v_{f^{\star}}^{4} \frac{\log d}{n} \right)^{1/3}$, where $n$ is the sample size and $\hat{f}$ is either a penalized least squares estimator or a greedily obtained version of such using linear combinations of sinusoidal, sigmoidal, ramp, ramp-squared or other smooth ridge functions. The candidate fits may be chosen from a continuum of functions, thus avoiding the rigidity of discretizations of the parameter space. On the other hand, if the candidate fits are chosen from a discretization, we show that $\mathbb{E} \| \hat{f} - f^{\star} \|^{2} \leq \left( v_{f^{\star}}^{3} \frac{\log d}{n} \right)^{2/5}$. This work bridges non-linear and non-parametric function estimation and includes single-hidden…
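To get a feel for the two rates stated in the abstract, the following is a minimal sketch that evaluates them numerically. It is an illustration only: the function names `continuum_bound` and `discretized_bound` are hypothetical, the values of `v`, `d`, and `n` are arbitrary choices, and any constants the full paper may attach to the bounds are ignored.

```python
import math


def continuum_bound(v, d, n):
    """Evaluate (v^4 * log(d) / n)^(1/3), the rate the abstract states
    for candidate fits chosen from a continuum. Assumes d > 1 and n > 0.
    Constants are ignored (an assumption of this sketch)."""
    return (v ** 4 * math.log(d) / n) ** (1.0 / 3.0)


def discretized_bound(v, d, n):
    """Evaluate (v^3 * log(d) / n)^(2/5), the rate the abstract states
    for candidate fits chosen from a discretization. Assumes d > 1 and n > 0.
    Constants are ignored (an assumption of this sketch)."""
    return (v ** 3 * math.log(d) / n) ** (2.0 / 5.0)


if __name__ == "__main__":
    v = 1.0   # hypothetical spectral norm of the target function
    n = 1000  # hypothetical sample size
    # The last value of d exceeds n to illustrate the high-dimensional regime.
    for d in (10, 1_000, 100_000):
        print(
            f"d={d:>7}  continuum={continuum_bound(v, d, n):.4f}  "
            f"discretized={discretized_bound(v, d, n):.4f}"
        )
```

Because the dimension enters only through $\log d$, both quantities grow slowly as `d` increases, which is why they can stay below 1 even when `d` is much larger than `n`.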

Taxonomy

Topics: Model Reduction and Neural Networks · Probabilistic and Robust Engineering Design · Image and Signal Denoising Methods