A direct proof of a unified law of robustness for Bregman divergence losses
Santanu Das, Jatin Batra, Piyush Srivastava

TL;DR
This paper gives a unified proof that overparameterization is necessary for robust (Lipschitz) interpolation across a broad class of Bregman divergence losses, extending previous results beyond scalar responses.
Contribution
The authors recast Bubeck and Sellke's proof using a bias-variance decomposition, generalizing it to vector-valued responses and Bregman divergence losses without relying on Rademacher complexity.
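As a rough illustration of the kind of bias-variance decomposition referred to here, the sketch below numerically checks the standard identity for a Bregman divergence D_phi(y, z) = phi(y) - phi(z) - <grad phi(z), y - z>: the average of D_phi(Y, z) equals the average of D_phi(Y, mean(Y)) plus D_phi(mean(Y), z). The two generators (squared norm and negative entropy), the helper names, and the synthetic data are assumptions chosen for illustration. They are not taken from the paper, and the paper's own argument may organize the decomposition differently.

```python
import numpy as np

def bregman(phi, grad_phi, y, z):
    """D_phi(y, z) = phi(y) - phi(z) - <grad_phi(z), y - z>; y may be a batch of rows."""
    return phi(y) - phi(z) - np.sum(grad_phi(z) * (y - z), axis=-1)

# Two standard convex generators (illustrative choices, not from the paper).
def sq_phi(v):
    return np.sum(v * v, axis=-1)          # gives squared Euclidean distance

def sq_grad(v):
    return 2.0 * v

def negent_phi(v):
    return np.sum(v * np.log(v), axis=-1)  # gives generalized KL divergence (v > 0)

def negent_grad(v):
    return np.log(v) + 1.0

def decomposition_gap(phi, grad_phi, Y, z):
    """Return mean D(Y, z) - [mean D(Y, mu) + D(mu, z)], where mu is the sample mean."""
    mu = Y.mean(axis=0)
    total = bregman(phi, grad_phi, Y, z).mean()   # expected loss of predicting z
    noise = bregman(phi, grad_phi, Y, mu).mean()  # "variance" term: loss of the mean
    bias = bregman(phi, grad_phi, mu, z)          # "bias" term: divergence of z from the mean
    return total - (noise + bias)

rng = np.random.default_rng(0)
Y = rng.uniform(0.5, 2.0, size=(1000, 3))  # synthetic vector-valued responses, all positive
z = np.array([1.0, 0.7, 1.5])              # an arbitrary fixed prediction

gap_sq = decomposition_gap(sq_phi, sq_grad, Y, z)
gap_kl = decomposition_gap(negent_phi, negent_grad, Y, z)
print(gap_sq, gap_kl)  # both should be zero up to floating-point error
```

The identity holds exactly for the empirical distribution of the samples, so the gap is zero up to rounding for either generator. This is why the response mean plays the role of the optimal predictor for every Bregman loss, not only for squared error.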
Findings
Overparameterization is necessary for robust interpolation with Bregman divergences.
The proof technique is extended to vector-valued responses.
The approach broadens understanding of robustness in deep learning models.
Abstract
In contemporary deep learning practice, models are often trained to near-zero loss, i.e., to nearly interpolate the training data. However, the number of parameters in the model is usually far more than the number of data points n, the theoretical minimum needed for interpolation: a phenomenon referred to as overparameterization. In an interesting piece of work, Bubeck and Sellke considered a natural notion of interpolation: the model is said to interpolate when its training loss goes below the loss of the conditional expectation of the response given the covariate. For this notion of interpolation and for a broad class of covariate distributions (specifically those satisfying a natural notion of concentration of measure), they showed that overparameterization is necessary for robust interpolation, i.e., if the interpolating function is required to be Lipschitz. Their main proof…
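The notion of interpolation described in the abstract can be made concrete with a small sketch: a predictor "interpolates" when its training loss falls below the training loss of the conditional expectation of the response given the covariate. The synthetic linear data model, the squared loss, and all names below (`interpolates`, `cond_mean`, `memorizer`, `constant`) are hypothetical choices for illustration only. The sketch says nothing about Lipschitz constants or parameter counts.

```python
import numpy as np

def squared_loss(pred, y):
    """Mean squared error over the training set."""
    return float(np.mean((pred - y) ** 2))

def interpolates(model_pred, cond_mean_pred, y, loss=squared_loss):
    """True if the model's training loss is below that of E[y | x] on the same data."""
    return loss(model_pred, y) < loss(cond_mean_pred, y)

rng = np.random.default_rng(1)
n, d = 200, 5
X = rng.normal(size=(n, d))
w = np.ones(d)                                 # hypothetical ground-truth weights
cond_mean = X @ w                              # E[y | x] under the assumed data model
y = cond_mean + rng.normal(scale=0.5, size=n)  # noisy responses

memorizer = y.copy()                 # predictions that reproduce the labels (zero loss)
constant = np.full(n, y.mean())      # a predictor that ignores the covariates

fits_noise = interpolates(memorizer, cond_mean, y)
underfits = interpolates(constant, cond_mean, y)
print(fits_noise, underfits)
```

Under this assumed setup the label-memorizing predictor fits below the noise level and so counts as interpolating, while the constant predictor does not.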
Taxonomy
Topics: Advanced Statistical Methods and Models · Statistical Distribution Estimation and Applications · Advanced Statistical Process Monitoring
