Climbing a shaky ladder: Better adaptive risk estimation

Moritz Hardt

arXiv:1706.02733 · cs.LG · June 12, 2017 · 5 cites

TL;DR

This paper introduces a randomized leaderboard algorithm that reduces overfitting in machine learning benchmarks, achieving leaderboard error O(1/n^{0.4}) compared with the previous best rate of O(1/n^{1/3}). It also discusses the obstacles to further progress and the resulting lower bounds in adaptive risk estimation.

Contribution

We present a new randomized algorithm for the leaderboard problem with improved error bounds and analyze the fundamental obstacles to further advancements in adaptive risk estimation.

Findings

01

Our algorithm achieves leaderboard error O(1/n^{0.4})

02

A new attack on the leaderboard algorithm distinguishes our algorithm from previous leaderboard algorithms, both theoretically and empirically

03

Improvement in bounds would imply breakthroughs in adaptive estimation theory
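
To make the gap between the two rates concrete, the following minimal sketch evaluates n^{-1/3} and n^{-0.4} for a few holdout sizes. It ignores the constants and logarithmic factors hidden by the O-notation, so the numbers only indicate how the rates scale, not actual leaderboard error. The function name `rate` is a hypothetical helper introduced for illustration.

```python
def rate(n, exponent):
    """Return n^(-exponent), the leading term of an error rate (constants ignored)."""
    return n ** (-exponent)


if __name__ == "__main__":
    for n in (10**3, 10**4, 10**6):
        previous = rate(n, 1 / 3)  # previous best rate, O(1/n^{1/3})
        improved = rate(n, 0.4)    # rate stated for the randomized algorithm
        # The ratio improved / previous equals n^(-1/15), so the gap widens slowly with n.
        print(f"n={n}: n^(-1/3)={previous:.4f}  n^(-0.4)={improved:.4f}  ratio={improved / previous:.3f}")
```

The ratio of the two rates is n^{-1/15}, which is why the improvement grows only gradually as the holdout set gets larger.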

Abstract

We revisit the leaderboard problem introduced by Blum and Hardt (2015) in an effort to reduce overfitting in machine learning benchmarks. We show that a randomized version of their Ladder algorithm achieves leaderboard error O(1/n^{0.4}) compared with the previous best rate of O(1/n^{1/3}). Short of proving that our algorithm is optimal, we point out a major obstacle toward further progress. Specifically, any improvement to our upper bound would lead to asymptotic improvements in the general adaptive estimation setting that have remained elusive in recent years. This connection also directly leads to lower bounds for specific classes of algorithms. In particular, we exhibit a new attack on the leaderboard algorithm that both theoretically and empirically distinguishes between our algorithm and previous leaderboard algorithms.
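
For readers unfamiliar with the setting, here is a minimal sketch of a Ladder-style leaderboard in the spirit of Blum and Hardt (2015): a submission's holdout loss is released only if it beats the best released score by at least a step size, and the released value is rounded to a multiple of that step. The abstract does not specify how the randomized version works, so the `noise_scale` parameter below (Laplace noise added to the update threshold) is purely an assumption for illustration and should not be read as the paper's algorithm. The class and parameter names are hypothetical.

```python
import numpy as np


class Ladder:
    """Sketch of a Ladder-style leaderboard for 0/1 classification loss."""

    def __init__(self, labels, step, noise_scale=0.0, seed=0):
        self.labels = np.asarray(labels)
        self.step = step                # minimum improvement required to update
        self.noise_scale = noise_scale  # assumption: 0.0 gives the deterministic rule
        self.rng = np.random.default_rng(seed)
        self.best = float("inf")        # best score released so far

    def submit(self, predictions):
        """Return the leaderboard score after scoring one submission."""
        # Empirical 0/1 loss of the submission on the holdout labels.
        loss = float(np.mean(np.asarray(predictions) != self.labels))
        # Hypothetical randomization: perturb the threshold with Laplace noise.
        noise = self.rng.laplace(0.0, self.noise_scale) if self.noise_scale > 0 else 0.0
        if loss < self.best - self.step + noise:
            # Release the new loss, rounded to the nearest multiple of the step size.
            self.best = round(loss / self.step) * self.step
        return self.best


def with_errors(labels, k):
    """Copy of labels with the first k entries flipped (k errors)."""
    preds = np.array(labels)
    preds[:k] = 1 - preds[:k]
    return preds


if __name__ == "__main__":
    labels = [0, 1] * 10
    board = Ladder(labels, step=0.1)
    for k in (8, 7, 4):
        print(k, "errors ->", board.submit(with_errors(labels, k)))
```

Because small improvements are never revealed, a participant learns little from each submission, which is what limits adaptive overfitting to the holdout set.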

Taxonomy

Topics: Privacy-Preserving Technologies in Data · Stochastic Gradient Optimization Techniques · Machine Learning and Algorithms