The Statistical Complexity of Early-Stopped Mirror Descent

Tomas Vaškevičius; Varun Kanade; Patrick Rebeschini

arXiv:2002.00189 · stat.ML · August 28, 2020 · 5 cites

TL;DR

This paper analyzes the excess risk of early-stopped, unconstrained mirror descent applied to the unregularized empirical squared loss for linear models and kernel methods. It bounds the excess risk along the iterate path in terms of offset Rademacher complexities and improves upon recent implicit regularization results.

Contribution

It establishes a novel connection between offset Rademacher complexities and mirror descent convergence, providing new excess risk bounds and simplifying proofs of existing results.
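For orientation, the two objects being connected can be written down in their standard textbook form. This is a sketch under assumed notation, not the paper's own statement: the symbols $\psi$ (mirror map), $\eta$ (step size), $R_n$ (empirical risk), $\mathcal{F}$ (function class), and $c$ (offset parameter) are chosen here for illustration, and the paper's normalization may differ.

```latex
% Unconstrained mirror descent on the empirical squared loss (assumed notation)
R_n(w) = \frac{1}{n} \sum_{i=1}^{n} \bigl( \langle w, x_i \rangle - y_i \bigr)^2,
\qquad
\nabla \psi(w_{t+1}) = \nabla \psi(w_t) - \eta \, \nabla R_n(w_t).

% Offset Rademacher complexity of a class \mathcal{F} with offset c > 0,
% where \sigma_1, \dots, \sigma_n are i.i.d. Rademacher signs
\mathfrak{R}_n^{\mathrm{off}}(\mathcal{F}, c)
  = \mathbb{E}_{\sigma} \sup_{f \in \mathcal{F}}
    \frac{1}{n} \sum_{i=1}^{n} \bigl[ \sigma_i f(x_i) - c \, f(x_i)^2 \bigr].
```

The negative quadratic term is what distinguishes the offset complexity from the usual Rademacher complexity: it penalizes functions with large values on the sample, which is the property that makes such complexities suited to the squared loss.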

Findings

01. Provides excess risk guarantees based on offset complexities.

02. Recovers recent implicit regularization results with shorter proofs.

03. Shows potential improvements over existing bounds in certain settings.

Abstract

Recently there has been a surge of interest in understanding implicit regularization properties of iterative gradient-based optimization algorithms. In this paper, we study the statistical guarantees on the excess risk achieved by early-stopped unconstrained mirror descent algorithms applied to the unregularized empirical risk with the squared loss for linear models and kernel methods. By completing an inequality that characterizes convexity for the squared loss, we identify an intrinsic link between offset Rademacher complexities and potential-based convergence analysis of mirror descent methods. Our observation immediately yields excess risk guarantees for the path traced by the iterates of mirror descent in terms of offset complexities of certain function classes depending only on the choice of the mirror map, initialization point, step-size, and the number of iterations. We apply…
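The setting described in the abstract can be illustrated with a minimal sketch: unconstrained mirror descent run on the unregularized empirical squared loss of a linear model, with the returned iterate chosen from the path. Everything named here is an assumption made for illustration, not taken from the paper: the function names `mirror_descent_path` and `early_stop`, the synthetic data, the step size, the identity mirror map (for which mirror descent reduces to plain gradient descent), and in particular the held-out validation rule used to pick the stopping time.

```python
import numpy as np

def mirror_descent_path(X, y, grad_psi, grad_psi_inv, w0, step_size, num_iters):
    """Return the iterates w_0, ..., w_T of unconstrained mirror descent on the
    unregularized empirical squared loss R_n(w) = (1/n) * ||X w - y||^2."""
    n = X.shape[0]
    w = np.asarray(w0, dtype=float)
    path = [w.copy()]
    for _ in range(num_iters):
        grad = (2.0 / n) * X.T @ (X @ w - y)    # gradient of R_n at w
        theta = grad_psi(w) - step_size * grad  # gradient step in the dual (mirror) space
        w = grad_psi_inv(theta)                 # map back to the primal space
        path.append(w.copy())
    return path

def early_stop(path, X_val, y_val):
    """Pick the iterate with the smallest held-out squared error.
    Assumption: a validation-based rule, not a stopping rule from the paper."""
    errors = [np.mean((X_val @ w - y_val) ** 2) for w in path]
    t = int(np.argmin(errors))
    return t, path[t]

def identity(v):
    # Gradient of psi(w) = 0.5 * ||w||^2 and its inverse are both the identity.
    return v

# Hypothetical synthetic data: a sparse linear signal observed with noise.
rng = np.random.default_rng(0)
n, d = 50, 20
w_star = np.zeros(d)
w_star[:3] = 1.0
X = rng.normal(size=(n, d))
y = X @ w_star + 0.5 * rng.normal(size=n)
X_val = rng.normal(size=(n, d))
y_val = X_val @ w_star + 0.5 * rng.normal(size=n)

path = mirror_descent_path(X, y, identity, identity, np.zeros(d), 0.05, 200)
t_stop, w_hat = early_stop(path, X_val, y_val)
```

Other mirror maps can be tried by passing a different pair `grad_psi` and `grad_psi_inv`, provided the second is the inverse of the first on the relevant domain. The number of iterations plays the role that an explicit regularization parameter would otherwise play.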

Videos

The Statistical Complexity of Early-Stopped Mirror Descent · SlidesLive

Taxonomy

Topics: Stochastic Gradient Optimization Techniques · Sparse and Compressive Sensing Techniques · Machine Learning and ELM