Nearly Optimal Learning using Sparse Deep ReLU Networks in Regularized Empirical Risk Minimization with Lipschitz Loss

Ke Huang; Mingming Liu; Shujie Ma

arXiv:2108.05990 · stat.ME · December 11, 2024 · Neural Comput.

Open Access

TL;DR

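This paper introduces a sparse deep ReLU network (SDRN) estimator, obtained by regularized empirical risk minimization with a Lipschitz loss, that achieves nearly optimal convergence rates for regression and classification problems. When the feature dimension is fixed, the dimension enters the rate only through a logarithmic term.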

Contribution

It develops a novel SDRN estimator with non-asymptotic excess risk bounds, showing nearly minimax-optimal rates and a network depth that grows only logarithmically with the sample size.

Findings

01. Achieves nearly minimax-optimal convergence rates for regression.

02. The depth of the network grows logarithmically with the sample size.

03. Deep networks need fewer parameters to prevent overfitting.
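
To give a sense of what "nearly the same rate as one-dimensional regression" means, the sketch below compares the textbook minimax rate for s-smooth functions of d variables, n^(-2s/(2s+d)) in squared error, with its one-dimensional counterpart n^(-2s/(2s+1)). This is an illustration only: the smoothness `s`, the sample size, and the function names are assumptions made here, the logarithmic factors in which the dimension appears are omitted, and the paper's exact rate is not stated in this summary.

```python
def classical_rate(n, s, d):
    """Textbook minimax rate n^(-2s/(2s+d)) for s-smooth functions of d variables.

    Illustrative only: squared-error risk, constants and log factors dropped.
    """
    return n ** (-2.0 * s / (2.0 * s + d))


def one_dim_rate(n, s):
    """The same rate with d = 1, the benchmark the summary refers to."""
    return classical_rate(n, s, 1)


if __name__ == "__main__":
    n, s = 10_000, 2  # hypothetical sample size and smoothness
    print("one-dimensional benchmark:", one_dim_rate(n, s))
    for d in (5, 20):
        # The classical rate degrades quickly as d grows (curse of dimensionality).
        print(f"classical rate, d={d}:", classical_rate(n, s, d))
```

The gap between the two printed columns is what a rate with the dimension confined to a logarithmic term avoids.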

Abstract

We propose a sparse deep ReLU network (SDRN) estimator of the regression function obtained from regularized empirical risk minimization with a Lipschitz loss function. Our framework can be applied to a variety of regression and classification problems. We establish novel non-asymptotic excess risk bounds for our SDRN estimator when the regression function belongs to a Sobolev space with mixed derivatives. We obtain a new nearly optimal risk rate in the sense that the SDRN estimator can achieve nearly the same optimal minimax convergence rate as one-dimensional nonparametric regression with the dimension only involved in a logarithm term when the feature dimension is fixed. The estimator has a slightly slower rate when the dimension grows with the sample size. We show that the depth of the SDRN estimator grows with the sample size in logarithmic order, and the total number of nodes and…
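
The objective described in the abstract, an empirical risk under a Lipschitz loss plus a regularization term, can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' estimator: the network here is small and dense (the SDRN sparsity pattern is not reproduced), the penalty is assumed to be a squared L2 norm of the weights, and all function names are hypothetical. The quantile and logistic losses are two standard examples of Lipschitz losses, covering a regression and a classification case.

```python
import math


def affine(W, b, v):
    """Return W v + b, with W given as a list of rows."""
    return [sum(w * a for w, a in zip(row, v)) + bi for row, bi in zip(W, b)]


def relu_net(x, layers):
    """Feed-forward ReLU network with a linear, scalar output layer.

    layers is a list of (W, b) pairs; ReLU is applied after every layer but the last.
    """
    h = x
    for W, b in layers[:-1]:
        h = [max(a, 0.0) for a in affine(W, b, h)]
    W, b = layers[-1]
    return affine(W, b, h)[0]


def quantile_loss(y, f, tau=0.5):
    """Check (pinball) loss; Lipschitz in f with constant max(tau, 1 - tau)."""
    r = y - f
    return max(tau * r, (tau - 1.0) * r)


def logistic_loss(y, f):
    """Logistic loss log(1 + exp(-y f)) for y in {-1, +1}; 1-Lipschitz in f."""
    z = -y * f
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))  # numerically stable form


def regularized_risk(xs, ys, layers, loss, lam):
    """Average loss plus lam times the squared L2 norm of the weights (assumed penalty)."""
    fit = sum(loss(y, relu_net(x, layers)) for x, y in zip(xs, ys)) / len(xs)
    penalty = sum(w * w for W, _ in layers for row in W for w in row)
    return fit + lam * penalty


if __name__ == "__main__":
    # Hypothetical 2-2-1 network and a single observation.
    layers = [([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0]), ([[1.0, 1.0]], [0.0])]
    print(regularized_risk([[1.0, 2.0]], [2.0], layers, quantile_loss, lam=0.1))
```

An estimator of this type would minimize `regularized_risk` over the network weights; the optimizer is left out because the summary does not describe one.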

Taxonomy

Topics: Neural Networks and Applications · Anomaly Detection Techniques and Applications · Fault Detection and Control Systems