Effective Minkowski Dimension of Deep Nonparametric Regression: Function Approximation and Statistical Theories
Zixuan Zhang, Minshuo Chen, Mengdi Wang, Wenjing Liao, Tuo Zhao

TL;DR
This paper introduces the effective Minkowski dimension to characterize data complexity in deep nonparametric regression, showing neural networks can adapt to this measure and reduce the impact of ambient dimensionality.
Contribution
It proposes a new complexity measure, the effective Minkowski dimension, and establishes its role in determining sample complexity for deep regression under relaxed data assumptions.
Findings
Sample complexity depends only on the effective Minkowski dimension p of the set around which the data concentrate, not on the ambient dimension.
For anisotropic Gaussian designs whose covariance eigenvalues decay, p grows at most logarithmically (exponential decay) or polynomially (polynomial decay) in the sample size n.
Deep networks adapt to the effective Minkowski dimension, mitigating the curse of ambient dimensionality.
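The Gaussian-design finding can be made concrete with a rough numerical sketch. The code below does not compute the paper's effective Minkowski dimension, whose definition is not reproduced on this page. It uses a crude stand-in: the number of leading covariance directions that must be kept so that the discarded variance falls below a tolerance tied to the sample size. The helper names (`truncation_level`, `exponential_decay`, `polynomial_decay`), the decay profiles, and the choice `tol = 1 / n` are all assumptions made for illustration.

```python
import math


def exponential_decay(d, rate=1.0):
    """Hypothetical spectrum: lambda_j = exp(-rate * j), j = 1..d."""
    return [math.exp(-rate * j) for j in range(1, d + 1)]


def polynomial_decay(d, power=2.0):
    """Hypothetical spectrum: lambda_j = j ** (-power), j = 1..d."""
    return [j ** (-power) for j in range(1, d + 1)]


def truncation_level(eigenvalues, tol):
    """Smallest k such that the eigenvalues after the first k sum to <= tol.

    Assumes `eigenvalues` is sorted in decreasing order and nonnegative.
    This is an illustrative proxy, not the paper's complexity measure.
    """
    tail = 0.0
    k = len(eigenvalues)
    # Walk up from the smallest eigenvalue, discarding directions while the
    # accumulated discarded variance stays within the tolerance.
    while k > 0 and tail + eigenvalues[k - 1] <= tol:
        tail += eigenvalues[k - 1]
        k -= 1
    return k


if __name__ == "__main__":
    d = 100_000  # ambient dimension (assumed)
    for n in (100, 10_000):
        tol = 1.0 / n  # assumed link between tolerance and sample size
        k_exp = truncation_level(exponential_decay(d), tol)
        k_poly = truncation_level(polynomial_decay(d), tol)
        print(f"n={n}: exponential decay keeps {k_exp}, polynomial decay keeps {k_poly}")
```

Under these assumptions the number of retained directions grows slowly with n for the exponential spectrum and much faster for the polynomial one, while staying far below the ambient dimension in both cases. The exact rates depend on the paper's own definition and are not claimed here.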
Abstract
Existing theories on deep nonparametric regression have shown that when the input data lie on a low-dimensional manifold, deep neural networks can adapt to the intrinsic data structures. In real world applications, such an assumption of data lying exactly on a low dimensional manifold is stringent. This paper introduces a relaxed assumption that the input data are concentrated around a subset of R^d denoted by S, and the intrinsic dimension of S can be characterized by a new complexity notation -- effective Minkowski dimension. We prove that, the sample complexity of deep nonparametric regression only depends on the effective Minkowski dimension of S denoted by p. We further illustrate our theoretical findings by considering nonparametric regression with an anisotropic Gaussian random design N(0, Σ), where Σ is full rank.…
Taxonomy
Topics: Machine Learning and Data Classification · Gaussian Processes and Bayesian Inference · Stochastic Gradient Optimization Techniques
