A Rademacher Complexity Based Method for Controlling Power and Confidence Level in Adaptive Statistical Analysis
Lorenzo De Stefani, Eli Upfal

TL;DR
This paper introduces RadaBound, a method leveraging Rademacher Complexity to control generalization error in adaptive statistical testing, addressing the challenge of dependent tests on the same holdout data.
Contribution
The paper presents RadaBound, a practical procedure that adapts Rademacher Complexity generalization bounds to adaptive testing, where each test may depend on the outcomes of earlier tests on the same holdout data.
Findings
RadaBound effectively controls error rates in adaptive testing.
The method demonstrates high statistical power in simulations.
Compared to existing approaches, RadaBound offers improved reliability.
Abstract
While standard statistical inference techniques and machine learning generalization bounds assume that tests are run on data selected independently of the hypotheses, practical data analysis and machine learning are usually iterative and adaptive processes where the same holdout data is often used for testing a sequence of hypotheses (or models), each of which may depend on the outcome of the previous tests on the same data. In this work, we present RadaBound, a rigorous, efficient, and practical procedure for controlling the generalization error when using a holdout sample for multiple adaptive testing. Our solution is based on a new application of the Rademacher Complexity generalization bounds, adapted to dependent tests. We demonstrate the statistical power and practicality of our method through extensive simulations and comparisons to alternative approaches.
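To make the central quantity concrete, the sketch below estimates the empirical Rademacher complexity of a finite set of functions on a fixed holdout sample by Monte Carlo. It is not the RadaBound procedure itself, which the abstract does not spell out. The function name `empirical_rademacher`, its parameters, and the toy data are assumptions made for illustration only.

```python
import numpy as np

def empirical_rademacher(values, n_draws=2000, seed=0):
    """Monte Carlo estimate of the empirical Rademacher complexity.

    values: array of shape (k, n), where values[j, i] = f_j(x_i) is the
            j-th function evaluated on the i-th holdout point (hypothetical
            layout chosen for this sketch).
    Returns the average, over random sign vectors sigma in {-1, +1}^n, of
    max_j (1/n) * sum_i sigma_i * f_j(x_i).
    """
    values = np.asarray(values, dtype=float)
    k, n = values.shape
    rng = np.random.default_rng(seed)
    # Each row is one independent vector of Rademacher signs.
    sigma = rng.choice([-1.0, 1.0], size=(n_draws, n))
    # corr[d, j] = correlation of sign vector d with function j.
    corr = sigma @ values.T / n
    # Supremum over the function set, then average over sign draws.
    return float(corr.max(axis=1).mean())

# Toy usage: 10 functions with values in {-1, +1} on 200 holdout points.
rng = np.random.default_rng(1)
toy_values = rng.choice([-1.0, 1.0], size=(10, 200))
estimate = empirical_rademacher(toy_values)
# Massart's finite-class lemma bounds the true value for functions in [-1, 1].
massart = np.sqrt(2 * np.log(10) / 200)
print(estimate, massart)
```

For functions bounded in [-1, 1], Massart's lemma caps this quantity at sqrt(2 ln k / n), so the estimate shrinks as the holdout sample grows and rises only slowly with the number of functions considered. That is the sense in which a Rademacher-style quantity can measure how much a set of tests can overfit one holdout sample.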
Taxonomy
Topics: Machine Learning and Algorithms · Statistical Methods and Inference · Stochastic Gradient Optimization Techniques
