$\ell_p$ Testing and Learning of Discrete Distributions

Bo Waggoner

arXiv:1412.2314·cs.DS·March 24, 2015


TL;DR

This paper studies uniformity testing and learning of discrete distributions under general $\ell_p$ metrics. For $p > 1$, both tasks can be done with a number of samples that is independent of the support size, in contrast to the classic $\ell_1$ case.

Contribution

It gives new sample complexity bounds for $\ell_p$ metrics, shows that these bounds are independent of the support size for $p > 1$, and demonstrates that for some $\ell_p$ metrics uniformity testing becomes easier over larger supports.

Findings

1. For $p > 1$, the sample complexity of both testing and learning is independent of the support size.

2. How uniformity testing complexity scales with the support size depends on $p$: testing becomes easier over larger supports if $p > 4/3$.

3. The proposed algorithms are order-optimal for all $\ell_p$ metrics studied.
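The support-size independence in the first finding can be checked numerically. The sketch below is an illustration, not the paper's algorithm: it assumes the learner is the plain empirical-frequency estimator (this summary does not name the learning algorithm), and the helper name `empirical_error` and the choice of a uniform target are hypothetical. It draws `m` samples from a uniform distribution on `n` items and measures the $\ell_p$ distance between the empirical frequencies and the truth.

```python
import random
from collections import Counter


def empirical_error(n, m, p, seed=0):
    """l_p distance between the uniform distribution on n items and the
    empirical frequencies of m i.i.d. samples drawn from it.

    Assumption: the "learner" is the empirical-frequency estimator.
    """
    rng = random.Random(seed)
    counts = Counter(rng.randrange(n) for _ in range(m))
    truth = 1.0 / n
    # Items that were sampled at least once.
    seen = sum(abs(c / m - truth) ** p for c in counts.values())
    # Items never sampled each contribute |0 - 1/n|^p.
    unseen = (n - len(counts)) * truth ** p
    return (seen + unseen) ** (1.0 / p)


if __name__ == "__main__":
    m = 2000
    for n in (10, 1000, 100000):
        l1 = empirical_error(n, m, p=1)
        l2 = empirical_error(n, m, p=2)
        print(f"n={n:>6}  l1 error={l1:.3f}  l2 error={l2:.4f}")
```

With the number of samples held fixed, the $\ell_2$ error stays at roughly $1/\sqrt{m}$ for every support size, while the $\ell_1$ error grows as the support grows. This matches the exact identity $\mathbb{E}\,\|\hat{p} - p\|_2^2 = (1 - \|p\|_2^2)/m \le 1/m$ for the empirical estimator.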

Abstract

The classic problems of testing uniformity of and learning a discrete distribution, given access to independent samples from it, are examined under general $\ell_p$ metrics. The intuitions and results often contrast with the classic $\ell_1$ case. For $p > 1$, we can learn and test with a number of samples that is independent of the support size of the distribution: with an $\ell_p$ tolerance $\epsilon$, $O(\max\{\sqrt{1/\epsilon^{q}}, 1/\epsilon^{2}\})$ samples suffice for testing uniformity and $O(\max\{1/\epsilon^{q}, 1/\epsilon^{2}\})$ samples suffice for learning, where $q = p/(p-1)$ is the conjugate of $p$. As this parallels the intuition that $O(\sqrt{n})$ and $O(n)$ samples suffice for the $\ell_1$ case, it seems that $1/\epsilon^{q}$ acts as an upper bound on the "apparent" support size. For some $\ell_p$ metrics, uniformity testing becomes easier over larger supports: a 6-sided die…
