Characterization of Excess Risk for Locally Strongly Convex Population Risk

Mingyang Yi; Ruoyu Wang; Zhi-Ming Ma

arXiv:2012.02456 · cs.LG · October 11, 2022

TL;DR

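This paper derives upper bounds on the expected excess risk of models trained with iterative algorithms, assuming only that the population risk is locally strongly convex around its local minima. The bounds cover non-convex problems in which the ratio of the parameter count $d$ to the sample size $n$ is below a fixed threshold.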

Contribution

It introduces dimension-insensitive excess risk bounds based on local strong convexity of the population risk, applicable to a broad class of iterative algorithms and to non-convex problems.

Findings

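01. For convex problems, the excess risk bound is of order $\tilde{\mathcal{O}}(1/n)$.

02. For non-convex problems with $d/n$ below a threshold independent of $n$, the order $\tilde{\mathcal{O}}(1/n)$ is maintained if the empirical risk has no spurious local minima with high probability.

03. Without the no-spurious-local-minima assumption, the bound for non-convex problems becomes $\tilde{\mathcal{O}}(1/\sqrt{n})$.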
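To make the $1/n$ order for convex problems concrete, here is a minimal sketch on a toy problem that is not taken from the paper. It assumes the quadratic population risk $R(w) = \frac{1}{2}\mathbb{E}[(w - z)^2]$ with $z \sim \mathcal{N}(\mu, \sigma^2)$, whose minimizer is $\mu$ and whose empirical risk minimizer is the sample mean. For this risk the excess risk is exactly $\frac{1}{2}(\hat{w} - \mu)^2$, so its expectation is $\sigma^2 / (2n)$. The function name and the default values of `mu`, `sigma`, `trials`, and `seed` are illustrative assumptions.

```python
import random
import statistics


def excess_risk_of_sample_mean(n, mu=1.0, sigma=2.0, trials=2000, seed=0):
    """Monte Carlo estimate of E[R(w_hat)] - R(w*) for R(w) = 0.5 * E[(w - z)^2].

    Toy setting (an assumption, not the paper's): z ~ N(mu, sigma^2), and
    w_hat is the empirical risk minimizer, i.e. the mean of n samples.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        w_hat = statistics.fmean(sample)
        # For this quadratic risk, R(w_hat) - R(mu) = 0.5 * (w_hat - mu)^2.
        total += 0.5 * (w_hat - mu) ** 2
    return total / trials


if __name__ == "__main__":
    for n in (10, 40, 160):
        estimate = excess_risk_of_sample_mean(n)
        # n * estimate should stay near sigma^2 / 2 if the rate is 1/n.
        print(n, estimate, n * estimate)
```

If the rate is indeed of order $1/n$, the product `n * estimate` should stay roughly constant near $\sigma^2 / 2$ as `n` grows. The paper's bounds concern general iterative algorithms and carry logarithmic factors, which this closed-form toy case does not exhibit.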

Abstract

We establish upper bounds on the expected excess risk of models trained by proper iterative algorithms that approximate local minima. Unlike results built on global strong convexity or global growth conditions, e.g., the PL-inequality, we only require the population risk to be locally strongly convex around its local minima. Concretely, our bound for convex problems is of order $\tilde{\mathcal{O}}(1/n)$. For non-convex problems with $d$ model parameters such that $d/n$ is smaller than a threshold independent of $n$, the order $\tilde{\mathcal{O}}(1/n)$ can be maintained if the empirical risk has no spurious local minima with high probability. Moreover, the bound for non-convex problems becomes $\tilde{\mathcal{O}}(1/\sqrt{n})$ without this assumption. Our results are derived via algorithmic stability and a characterization of the empirical risk's landscape. Compared with the…
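For readers unfamiliar with the terms in the abstract, the following block writes out the standard textbook forms of the quantities involved. The notation ($f$, $R$, $\hat{w}$, $w^*$, $\lambda$, $r$, $\mu$) is assumed here for illustration, and the paper's own definitions may differ in detail.

```latex
% Population risk and expected excess risk (notation assumed for illustration)
R(w) = \mathbb{E}_{z}\,[f(w, z)], \qquad
\text{excess risk} = \mathbb{E}\,[R(\hat{w})] - \inf_{w} R(w)

% Local strong convexity around a local minimum w^* (radius r > 0, modulus \lambda > 0)
R(v) \ge R(u) + \langle \nabla R(u),\, v - u \rangle + \frac{\lambda}{2}\,\|v - u\|^2
\quad \text{for all } u, v \in \{ w : \|w - w^*\| \le r \}

% Global PL-inequality, for contrast (\mu > 0)
\frac{1}{2}\,\|\nabla R(w)\|^2 \ge \mu\,\bigl(R(w) - \inf_{w'} R(w')\bigr)
\quad \text{for all } w
```

The contrast is that the PL-inequality and global strong convexity constrain $R$ everywhere, while the local condition only constrains it inside a ball around each local minimum.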

Code & Models

Repositories

kuangliu/pytorch-cifar · PyTorch · Official

Videos

Characterization of Excess Risk for Locally Strongly Convex Population Risk · SlidesLive

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Machine Learning and Algorithms