Minimax Lower Bounds for Ridge Combinations Including Neural Nets
Jason M. Klusowski, Andrew R. Barron

TL;DR
This paper establishes minimax lower bounds for estimating functions using ridge combinations, including neural networks, showing how error rates depend on dimension, sample size, and parameter constraints through information-theoretic analysis.
Contribution
It provides the first minimax lower bounds for ridge combination models, including neural networks, with detailed dependence on dimension, sample size, and parameter norms.
Findings
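Minimax mean square error is of order d/n raised to a fractional power when d is of smaller order than n
Minimax mean square error is of order (log d)/n raised to a fractional power when d is of larger order than n
Bounds depend on the constraints v_0 and v_1 on the norms of the inner and outer parameters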
Abstract
Estimation of functions of d variables is considered using ridge combinations of the form \sum_{k=1}^m c_{1,k} \phi(\sum_{j=1}^d c_{0,j,k} x_j - b_k), where the activation function \phi is a function with bounded value and derivative. These include single-hidden layer neural networks, polynomials, and sinusoidal models. From a sample of size n of possibly noisy values at random sites X \in B = [-1,1]^d, the minimax mean square error is examined for functions in the closure of the \ell_1 hull of ridge functions with activation \phi. It is shown to be of order d/n to a fractional power (when d is of smaller order than n), and to be of order (log d)/n to a fractional power (when d is of larger order than n). Dependence on constraints v_0 and v_1 on the \ell_1 norms of inner parameter c_0 and outer…
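For illustration, a ridge combination can be evaluated in a few lines of NumPy. This is a minimal sketch, not the authors' code: it assumes the single-hidden-layer form f(x) = \sum_k c_{1,k} \phi(c_{0,k} · x - b_k), uses tanh as one example of an activation with bounded value and derivative, and assumes the constraints v_0 and v_1 bound the l1 norm of each inner weight vector and of the outer weight vector respectively. The names `ridge_combination` and `within_l1_constraints` are hypothetical.

```python
import numpy as np

def ridge_combination(x, c0, b, c1, phi=np.tanh):
    """Evaluate f(x) = sum_k c1[k] * phi(c0[k] . x - b[k]).

    x: (n, d) inputs, c0: (m, d) inner weights, b: (m,) shifts, c1: (m,) outer weights.
    tanh is an assumed example of a bounded activation with bounded derivative.
    """
    x = np.atleast_2d(x)
    return phi(x @ c0.T - b) @ c1

def within_l1_constraints(c0, c1, v0, v1, tol=1e-12):
    """Check the assumed constraints: each ||c0[k]||_1 <= v0 and ||c1||_1 <= v1."""
    inner_ok = np.all(np.abs(c0).sum(axis=1) <= v0 + tol)
    outer_ok = np.abs(c1).sum() <= v1 + tol
    return bool(inner_ok and outer_ok)

rng = np.random.default_rng(0)
d, m, n = 5, 3, 4                            # dimension, hidden units, sample size
c0 = rng.uniform(-1, 1, (m, d))
c0 /= np.abs(c0).sum(axis=1, keepdims=True)  # rescale so each inner l1 norm is 1
b = np.zeros(m)
c1 = np.array([0.5, -0.3, 0.2])              # outer l1 norm is 1
X = rng.uniform(-1, 1, (n, d))               # random sites in [-1, 1]^d
values = ridge_combination(X, c0, b, c1)
```

Because |tanh| is at most 1, every value of such a function is bounded in magnitude by the l1 norm of the outer weights, which is one reason an outer constraint like v_1 controls the size of the function class.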
Taxonomy
Topics: Neural Networks and Applications · Fuzzy Logic and Control Systems · Face and Expression Recognition
