Information bottleneck theory of high-dimensional regression: relevancy, efficiency and optimality

Vudtiwat Ngampruetikorn; David J. Schwab

arXiv:2208.03848·cs.IT·October 13, 2022·1 cites

TL;DR

This paper applies information bottleneck theory to high-dimensional linear regression, analyzing the trade-offs between residual and relevant information, and revealing fundamental limits and phenomena like double descent in learning.

Contribution

It introduces an information-theoretic framework for understanding overfitting and optimality in high-dimensional regression, including new bounds and insights into algorithm efficiency.

Findings

1. Optimal algorithms minimize residual information while maximizing relevant information.

2. The information efficiency of randomized ridge regression is characterized relative to that of optimal algorithms.

3. The analysis reveals information-theoretic analogs of the double and multiple descent phenomena.

Abstract

Avoiding overfitting is a central challenge in machine learning, yet many large neural networks readily achieve zero training loss. This puzzling contradiction necessitates new approaches to the study of overfitting. Here we quantify overfitting via residual information, defined as the bits in fitted models that encode noise in training data. Information efficient learning algorithms minimize residual information while maximizing the relevant bits, which are predictive of the unknown generative models. We solve this optimization to obtain the information content of optimal algorithms for a linear regression problem and compare it to that of randomized ridge regression. Our results demonstrate the fundamental trade-off between residual and relevant information and characterize the relative information efficiency of randomized regression with respect to optimal algorithms. Finally, using…
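The two quantities named in the abstract can be illustrated with a small numerical sketch. It is not the paper's derivation and rests on several assumptions: the true weights W have an isotropic Gaussian prior, the labels are Y = XW + Gaussian noise with X held fixed, and "randomized ridge regression" is taken to mean the ridge estimate plus isotropic Gaussian noise (T = AY + ξ). Under one natural reading of the abstract's definitions, also assumed here, relevant information is I(T; W) and residual information is I(T; Y | W). The function name and the parameters `s_w`, `s_n`, `s_t` are hypothetical. Because everything is jointly Gaussian, both quantities reduce to log-determinants of covariance matrices.

```python
import numpy as np


def gaussian_info_bits(X, lam, s_w=1.0, s_n=1.0, s_t=0.1):
    """Relevant and residual information (bits) of a noisy ridge estimator.

    Illustrative assumptions (not taken from the paper):
      W ~ N(0, s_w^2 I),  Y = X W + eps,  eps ~ N(0, s_n^2 I),
      T = A Y + xi,       xi ~ N(0, s_t^2 I),
      A = (X^T X + lam I)^{-1} X^T  (the ridge map).
    Returns (I(T;W), I(T;Y|W)), both in bits.
    """
    n, p = X.shape
    # Ridge map A with shape (p, n); lam > 0 keeps the system well-posed.
    A = np.linalg.solve(X.T @ X + lam * np.eye(p), X.T)

    cov_y = s_w**2 * X @ X.T + s_n**2 * np.eye(n)           # Cov(Y)
    cov_t = A @ cov_y @ A.T + s_t**2 * np.eye(p)            # Cov(T)
    cov_t_given_w = s_n**2 * A @ A.T + s_t**2 * np.eye(p)   # Cov(T | W)

    def logdet(M):
        # slogdet returns (sign, log|det|); these matrices are positive definite.
        return np.linalg.slogdet(M)[1]

    # I(T;W) = h(T) - h(T|W); I(T;Y|W) = h(T|W) - h(T|Y), with Cov(T|Y) = s_t^2 I.
    relevant = 0.5 * (logdet(cov_t) - logdet(cov_t_given_w)) / np.log(2)
    residual = 0.5 * (logdet(cov_t_given_w) - p * np.log(s_t**2)) / np.log(2)
    return relevant, residual


rng = np.random.default_rng(0)
X = rng.standard_normal((20, 50))  # more features than samples
for lam in (0.01, 1.0, 100.0):
    relevant, residual = gaussian_info_bits(X, lam)
    print(f"lam={lam:g}: relevant={relevant:.2f} bits, residual={residual:.2f} bits")
```

In this toy model W → Y → T forms a Markov chain, so the two quantities sum to the total information I(T; Y) that the fitted model carries about the training labels. Increasing `lam` shrinks the ridge map and therefore the residual bits, which gives a concrete, if simplified, picture of the trade-off the abstract describes.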

Videos

Information bottleneck theory of high-dimensional regression: relevancy, efficiency and optimality (SlidesLive)

Taxonomy

Topics: Statistical Mechanics and Entropy · Neural Networks and Applications · Stochastic Gradient Optimization Techniques

Methods: Linear Regression