Conditioning of Random Feature Matrices: Double Descent and Generalization Error

Zhijun Chen; Hayden Schaeffer

arXiv:2110.11477·stat.ML·November 8, 2021·6 cites

TL;DR

This paper analyzes the condition number of random feature matrices, showing they are well-conditioned when the complexity ratio N/m (neurons to data samples) scales like log^{-1}(N) or log(m), and links this conditioning to the double descent phenomenon in generalization error.

Contribution

It provides high-probability bounds on the condition number and restricted isometry constants of random feature matrices, connecting these to the double descent behavior in risk.
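For reference, the two quantities named above have standard textbook definitions, sketched here for a generic matrix $A \in \mathbb{C}^{m \times N}$. The symbols $\kappa$, $\delta_s$, and $c$ are notation assumed for this note, and the paper may apply its own normalization to the random feature matrix before stating its bounds.

```latex
% Condition number: ratio of the largest to the smallest singular value of A
\kappa(A) = \frac{\sigma_{\max}(A)}{\sigma_{\min}(A)}

% Restricted isometry constant of order s: the smallest \delta_s \ge 0 such that,
% for every s-sparse vector c,
(1 - \delta_s)\,\|c\|_2^2 \;\le\; \|A c\|_2^2 \;\le\; (1 + \delta_s)\,\|c\|_2^2
```

A matrix is called well-conditioned when $\kappa(A)$ stays close to 1, so a least squares solve with $A$ does not greatly amplify noise in the data.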

Findings

01

The random feature matrix is well-conditioned when the complexity ratio N/m scales like log^{-1}(N) or log(m)

02

Risk exhibits double descent behavior linked to the condition number's behavior

03

Risk decreases with increasing N and m, even with noise

Abstract

We provide (high probability) bounds on the condition number of random feature matrices. In particular, we show that if the complexity ratio $\frac{N}{m}$, where $N$ is the number of neurons and $m$ is the number of data samples, scales like $\log^{-1}(N)$ or $\log(m)$, then the random feature matrix is well-conditioned. This result holds without the need of regularization and relies on establishing various concentration bounds between dependent components of the random feature matrix. Additionally, we derive bounds on the restricted isometry constant of the random feature matrix. We prove that the risk associated with regression problems using a random feature matrix exhibits the double descent phenomenon and that this is an effect of the double descent behavior of the condition number. The risk bounds include the underparameterized setting using the least squares problem and the…
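The dependence of conditioning on the complexity ratio $\frac{N}{m}$ can be explored numerically. The following is a minimal sketch, not the authors' experimental setup: the complex exponential feature map, the Gaussian sampling of data and weights, the input dimension `d`, and the function names are all assumptions made for illustration. It builds an $m \times N$ random feature matrix and reports the ratio of its largest to smallest singular value for several values of $N$.

```python
import numpy as np


def random_feature_matrix(X, W):
    """Return A with A[j, k] = exp(i * <x_j, w_k>).

    X: (m, d) data samples, W: (N, d) random weights.
    The complex exponential feature map is an assumption of this sketch.
    """
    return np.exp(1j * (X @ W.T))


def condition_number(m, N, d=5, seed=0):
    """Condition number of an m x N random feature matrix.

    Data and weights are drawn i.i.d. standard normal (an assumption).
    """
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((m, d))
    W = rng.standard_normal((N, d))
    A = random_feature_matrix(X, W)
    # Singular values are returned in descending order.
    s = np.linalg.svd(A, compute_uv=False)
    return s[0] / s[-1]


if __name__ == "__main__":
    m = 200  # number of data samples, held fixed
    for N in (20, 100, 200, 400, 2000):  # number of neurons
        kappa = condition_number(m, N)
        print(f"N/m = {N / m:6.2f}   condition number = {kappa:.3e}")
```

Sweeping `N` through `m` in this way gives a quick look at how the condition number behaves in the underparameterized ($N < m$), interpolation ($N \approx m$), and overparameterized ($N > m$) regimes that the abstract associates with double descent.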

Taxonomy

Topics: Sparse and Compressive Sensing Techniques · Stochastic Gradient Optimization Techniques · Random Matrices and Applications