A Theory of Dynamic Benchmarks

Ali Shirali; Rediet Abebe; Moritz Hardt

arXiv:2210.03165 · cs.LG · March 3, 2023 · 1 citation

TL;DR

This paper provides a theoretical foundation for dynamic benchmarks, analyzing their potential and limitations through models and simulations, and highlighting how data collection strategies impact model performance over iterative rounds.

Contribution

It introduces the first theoretical models of dynamic benchmarking, analyzing performance progression and limitations, and supports findings with simulations on real datasets.

Findings

01. Model performance improves initially but can stall after as few as three rounds.

02. Label noise exacerbates performance stagnation.

03. Hierarchical data collection models outperform simpler ones.

Abstract

Dynamic benchmarks interweave model fitting and data collection in an attempt to mitigate the limitations of static benchmarks. In contrast to an extensive theoretical and empirical study of the static setting, the dynamic counterpart lags behind due to limited empirical studies and no apparent theoretical foundation to date. Responding to this deficit, we initiate a theoretical study of dynamic benchmarking. We examine two realizations, one capturing current practice and the other modeling more complex settings. In the first model, where data collection and model fitting alternate sequentially, we prove that model performance improves initially but can stall after only three rounds. Label noise arising from, for instance, annotator disagreement leads to even stronger negative results. Our second model generalizes the first to the case where data collection and model fitting have a…
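The alternation of model fitting and data collection described in the abstract can be illustrated with a toy loop. The sketch below is not the authors' model or experimental setup. It assumes a hypothetical one-dimensional domain, a noise-free interval-shaped labeling rule, a decision-stump model class, and a collection step that samples new points from the current model's errors. All function names and parameters are made up for illustration. It shows only the general shape of a sequential dynamic benchmark and how progress can plateau when the model class is limited.

```python
import random


def predict(model, x):
    """Decision stump: output `polarity` when x > threshold, else the opposite label."""
    threshold, polarity = model
    return polarity if x > threshold else 1 - polarity


def fit_stump(data):
    """Return the (threshold, polarity) pair with the fewest training errors."""
    xs = sorted({x for x, _ in data})
    candidates = [xs[0] - 1] + xs  # include a threshold below every sample
    best = None
    for threshold in candidates:
        for polarity in (0, 1):
            errs = sum(1 for x, y in data if predict((threshold, polarity), x) != y)
            if best is None or errs < best[0]:
                best = (errs, threshold, polarity)
    return best[1], best[2]


def run_dynamic_benchmark(rounds=5, n_per_round=200, seed=0):
    """Alternate model fitting and error-targeted data collection (toy assumption)."""
    rng = random.Random(seed)
    domain = list(range(100))

    def truth(x):
        # Assumed ground truth: an interval, which no single stump can represent.
        return 1 if 30 <= x < 70 else 0

    # Round 0: an ordinary sample from the whole domain.
    data = [(x, truth(x)) for x in rng.choices(domain, k=n_per_round)]
    history = []
    for _ in range(rounds):
        model = fit_stump(data)
        errors = [x for x in domain if predict(model, x) != truth(x)]
        history.append(len(errors) / len(domain))  # error rate over the full domain
        if not errors:
            break
        # Collection step: sample new examples only where the current model fails.
        data += [(x, truth(x)) for x in rng.choices(errors, k=n_per_round)]
    return history
```

Calling `run_dynamic_benchmark()` returns the per-round error rate over the full domain. Because a single stump cannot represent the assumed interval rule, the error rate in this toy setup never reaches zero regardless of how many rounds are run.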

Videos

A Theory of Dynamic Benchmarks · SlidesLive

Taxonomy

Topics: Explainable Artificial Intelligence (XAI) · Machine Learning and Data Classification · Sports Analytics and Performance