Thinning to improve two-sample discrepancy
Gleb Smirnov, Roman Vershynin

TL;DR
This paper introduces an online thinning algorithm that reduces the discrepancy between two samples from the same distribution from order \(\sqrt{n}\) to polylogarithmic in \(n\) by discarding a small fraction of the points.
Contribution
The paper presents an online method for thinning two samples that lowers their discrepancy from the typical order \(\sqrt{n}\) to \(O(\log^{2d} n)\).
Findings
Discrepancy reduced from \(O(\sqrt{n})\) to \(O(\log^{2d} n)\)
Algorithm is simple and online, suitable for real-time applications
Small fraction of points discarded to achieve discrepancy reduction
Abstract
The discrepancy between two independent samples \(X_1,\dots,X_n\) and \(Y_1,\dots,Y_n\) drawn from the same distribution on \(\mathbb{R}^d\) typically has order \(O(\sqrt{n})\) even in one dimension. We give a simple online algorithm that reduces the discrepancy to \(O(\log^{2d} n)\) by discarding a small fraction of the points.
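To make the \(\sqrt{n}\) baseline concrete, the following minimal sketch measures a one-dimensional two-sample discrepancy empirically. It assumes the discrepancy is taken over half-lines, i.e. the maximum over thresholds \(t\) of \(|\#\{i : X_i \le t\} - \#\{i : Y_i \le t\}|\). The abstract does not state the exact definition, so this choice, the function name `two_sample_discrepancy_1d`, and the uniform test distribution are all assumptions made for illustration. The sketch does not implement the paper's thinning algorithm.

```python
import math
import random


def two_sample_discrepancy_1d(xs, ys):
    """Half-line discrepancy: max over t of |#{x <= t} - #{y <= t}|.

    Assumed definition for illustration; the paper may use a different
    family of test sets.
    """
    # Tag x-points with +1 and y-points with -1, then sweep left to right.
    events = sorted([(x, 1) for x in xs] + [(y, -1) for y in ys])
    running = 0  # (#x seen so far) - (#y seen so far)
    worst = 0
    i = 0
    while i < len(events):
        t = events[i][0]
        # Process all points tied at the same value before measuring.
        while i < len(events) and events[i][0] == t:
            running += events[i][1]
            i += 1
        worst = max(worst, abs(running))
    return worst


if __name__ == "__main__":
    rng = random.Random(0)
    for n in (100, 10_000, 1_000_000):
        xs = [rng.random() for _ in range(n)]
        ys = [rng.random() for _ in range(n)]
        d = two_sample_discrepancy_1d(xs, ys)
        # The ratio to sqrt(n) should stay of constant order as n grows.
        print(n, d, d / math.sqrt(n))
```

Running the script prints the discrepancy next to its ratio to \(\sqrt{n}\) for several sample sizes. A ratio that stays roughly constant as \(n\) grows is the behavior the abstract describes for unthinned samples.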
Taxonomy
Topics: Statistical Methods in Clinical Trials
