On the optimality of the aggregate with exponential weights for low temperatures

Guillaume Lecué; Shahar Mendelson

arXiv:1303.5180·math.ST·March 22, 2013

TL;DR

This paper studies the aggregate with exponential weights (AEW) in regression and shows that it is suboptimal at low temperatures unless a Bernstein condition holds, in which case it is optimal with a refined complexity measure.

Contribution

The paper characterizes the conditions under which AEW is optimal or suboptimal in low-temperature regimes, introducing a Bernstein condition for optimality.

Findings

01

AEW is suboptimal in expectation for low temperatures.

02

AEW may concentrate around suboptimal functions with high probability.

03

Under a Bernstein condition, AEW achieves optimality in low-temperature regimes.

Abstract

Given a finite class of functions $F$, the problem of aggregation is to construct a procedure with a risk as close as possible to the risk of the best element in the class. A classical procedure (PAC-Bayesian statistical learning theory (2004) Paris 6; Statistical Learning Theory and Stochastic Optimization (2001) Springer; Ann. Statist. 28 (2000) 75-87) is the aggregate with exponential weights (AEW), defined by \[\tilde{f}^{\mathrm{AEW}}=\sum_{f\in F}\hat{\theta}(f)f,\qquad \text{where}\quad \hat{\theta}(f)=\frac{\exp(-(n/T)R_n(f))}{\sum_{g\in F}\exp(-(n/T)R_n(g))},\] where $T > 0$ is called the temperature parameter and $R_{n} (\cdot)$ is an empirical risk. In this article, we study the optimality of the AEW in the regression model with random design and in the low-temperature regime. We prove three properties of AEW. First, we show that AEW is a suboptimal aggregation procedure in expectation…
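The weight formula in the abstract can be illustrated with a minimal sketch. It assumes the empirical risks $R_n(f)$ have already been computed for each function in the class; the function names `aew_weights` and `aew_predict` and the example risk values are hypothetical and not taken from the paper. The maximum score is subtracted before exponentiating, which leaves the normalized weights unchanged but avoids underflow when $n/T$ is large.

```python
import math


def aew_weights(risks, n, temperature):
    """Weights theta(f) proportional to exp(-(n / T) * R_n(f)).

    `risks` holds one empirical risk R_n(f) per function f in the class.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scores = [-(n / temperature) * r for r in risks]
    # Shifting by the max score cancels in the ratio and keeps exp() stable.
    top = max(scores)
    unnormalized = [math.exp(s - top) for s in scores]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def aew_predict(weights, predictions):
    """Value of the aggregate sum_f theta(f) * f(x) at a single point x."""
    return sum(w * p for w, p in zip(weights, predictions))


# Illustrative (made-up) empirical risks for a class of three functions.
risks = [0.50, 0.52, 0.90]
w_low = aew_weights(risks, n=100, temperature=0.1)     # low temperature
w_high = aew_weights(risks, n=100, temperature=100.0)  # high temperature
```

In this toy example the low-temperature weights put almost all their mass on the empirical risk minimizer, while the high-temperature weights are spread more evenly across the class. This only illustrates how $T$ shapes the weights; it says nothing about the optimality results proved in the paper.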
