Optimal Algorithms for Augmented Testing of Discrete Distributions
Maryam Aliakbarpour, Piotr Indyk, Ronitt Rubinfeld, Sandeep Silwal

TL;DR
This paper introduces adaptive algorithms for hypothesis testing of discrete distributions that leverage predictive models to reduce sample complexity, achieving optimal bounds and robustness without prior knowledge of prediction accuracy.
Contribution
It presents novel adaptive algorithms that utilize predicted distributions to improve sample efficiency in hypothesis testing, with proven optimality and robustness.
Findings
Sample complexity reduction depends on predictor quality
Algorithms adaptively self-adjust to prediction accuracy
Experimental results outperform worst-case guarantees
Abstract
We consider the problem of hypothesis testing for discrete distributions. In the standard model, where we have sample access to an underlying distribution, extensive research has established optimal bounds for uniformity testing, identity testing (goodness of fit), and closeness testing (equivalence or two-sample testing). We explore these problems in a setting where a predicted data distribution, possibly derived from historical data or predictive machine learning models, is available. We demonstrate that such a predictor can indeed reduce the number of samples required for all three property testing tasks. The reduction in sample complexity depends directly on the predictor's quality, measured by its total variation distance from the underlying distribution. A key advantage of our algorithms is their adaptability to the precision of the prediction. Specifically, our algorithms can self-adjust their…
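The quality measure named in the abstract, total variation distance, can be illustrated with a minimal sketch. This is only the standard definition (half the L1 distance between two probability vectors on the same finite domain), not the paper's testing algorithm. The function name `tv_distance` and the example vectors are hypothetical, chosen purely for illustration. In the testing setting the underlying distribution is unknown and only samples are available, so this computation is a way to read the measure, not something a tester could run directly.

```python
from typing import Sequence


def tv_distance(p: Sequence[float], q: Sequence[float]) -> float:
    """Total variation distance between two distributions on the same finite domain."""
    if len(p) != len(q):
        raise ValueError("distributions must share the same domain size")
    # Standard identity: TV distance is half the L1 distance between the vectors.
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))


# Hypothetical example (not from the paper): a uniform underlying
# distribution over 4 elements and a skewed predicted distribution.
true_dist = [0.25, 0.25, 0.25, 0.25]
predicted = [0.40, 0.30, 0.20, 0.10]

distance = tv_distance(true_dist, predicted)
print(distance)
```

A distance near 0 corresponds to an accurate predictor, and a distance near 1 to a predictor that places almost all of its mass where the underlying distribution has almost none.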
Taxonomy
Topics: Fault Detection and Control Systems · Advanced Statistical Process Monitoring · Control Systems and Identification
