Margin-adaptive model selection in statistical learning

Sylvain Arlot; Peter L. Bartlett

arXiv:0804.2937·math.ST·May 2, 2011

TL;DR

This paper studies model selection procedures that adapt to the margin condition in statistical learning. It shows that some penalization procedures achieve this adaptivity when the models are nested, and that such adaptivity is not always possible when the models are not nested.

Contribution

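It introduces strong margin adaptivity, a requirement based on a weaker version of the margin condition that accounts for learning being much easier within a small model than within a large one. It proves that some penalization procedures with data-dependent penalties meet this requirement for nested models, and shows that it is not always achievable for non-nested models.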

Findings

01

Some penalization procedures, including local Rademacher complexities, are strongly margin adaptive when the models are nested.

02

When the models are not nested, strong margin adaptivity is not always possible.

03

Contrary to previous results, the nested-model guarantee holds with penalties that depend only on the data.

Abstract

A classical condition for fast learning rates is the margin condition, first introduced by Mammen and Tsybakov. We tackle in this paper the problem of adaptivity to this condition in the context of model selection, in a general learning framework. Actually, we consider a weaker version of this condition that allows one to take into account that learning within a small model can be much easier than within a large one. Requiring this "strong margin adaptivity" makes the model selection problem more challenging. We first prove, in a general framework, that some penalization procedures (including local Rademacher complexities) exhibit this adaptivity when the models are nested. Contrary to previous results, this holds with penalties that only depend on the data. Our second main result is that strong margin adaptivity is not always possible when the models are not nested: for every model…
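
To make the idea of penalized model selection over nested models concrete, here is a minimal sketch. It is not the procedure analysed in the paper: the histogram classifiers, the synthetic data, and the penalty `sqrt(d / n)` are all assumptions chosen for illustration, and the penalty is a simple global one rather than a local Rademacher complexity. The sketch only shows the general shape of the criterion "empirical risk plus penalty" minimized over a nested family.

```python
import math
import random


def histogram_empirical_risk(xs, ys, n_bins):
    """Empirical 0-1 risk of the majority-vote histogram classifier
    built on n_bins equal-width bins of [0, 1)."""
    ones = [0] * n_bins    # number of label-1 points per bin
    counts = [0] * n_bins  # number of points per bin
    for x, y in zip(xs, ys):
        b = min(int(x * n_bins), n_bins - 1)
        counts[b] += 1
        ones[b] += y
    # A majority vote in each bin misclassifies the minority label.
    errors = sum(min(k, c - k) for k, c in zip(ones, counts))
    return errors / len(xs)


def select_model(xs, ys, bin_counts, penalty):
    """Return the bin count minimizing empirical risk + penalty,
    together with the criterion value for every candidate."""
    n = len(xs)
    crit = {d: histogram_empirical_risk(xs, ys, d) + penalty(d, n)
            for d in bin_counts}
    return min(crit, key=crit.get), crit


# Hypothetical data: the label is 1 when x > 0.5, flipped with probability 0.1.
rng = random.Random(0)
n = 500
xs = [rng.random() for _ in range(n)]
ys = [int((x > 0.5) != (rng.random() < 0.1)) for x in xs]

# Dyadic partitions refine one another, so the models are nested.
bin_counts = [1, 2, 4, 8, 16, 32]
risks = [histogram_empirical_risk(xs, ys, d) for d in bin_counts]

# Illustrative penalty (an assumption, not the paper's penalty).
selected, crit = select_model(xs, ys, bin_counts,
                              lambda d, n: math.sqrt(d / n))
```

Because the models are nested, the empirical risk in `risks` can only decrease as the partition gets finer. The penalty is what stops the criterion from always picking the largest model.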
