Chebyshev polynomials, moment matching, and optimal estimation of the unseen
Yihong Wu, Pengkun Yang

TL;DR
This paper introduces a Chebyshev polynomial-based estimator for support size of discrete distributions, achieving near-optimal sample complexity and improved accuracy over previous methods, with efficient computation and practical validation.
Contribution
The paper presents a novel linear estimator using Chebyshev polynomials that improves sample complexity bounds and computational efficiency for support size estimation.
Findings
Attains the optimal sample complexity within a factor of six asymptotically.
Outperforms previous methods in accuracy and efficiency on synthetic and real data.
Provides a scalable and practical estimation procedure.
Abstract
We consider the problem of estimating the support size of a discrete distribution whose minimum non-zero mass is at least $\frac{1}{k}$. Under the independent sampling model, we show that the sample complexity, i.e., the minimal sample size to achieve an additive error of $\epsilon k$ with probability at least 0.1, is within universal constant factors of $\frac{k}{\log k}\log^2\frac{1}{\epsilon}$, which improves the state-of-the-art result of $\frac{k}{\epsilon^2 \log k}$ in [VV13]. Similar characterization of the minimax risk is also obtained. Our procedure is a linear estimator based on the Chebyshev polynomial and its approximation-theoretic properties, which can be evaluated in $O(n+\log^2 k)$ time and attains the sample complexity within a factor of six asymptotically. The superiority of the proposed estimator in terms of accuracy, computational efficiency and scalability…
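To make the idea of a linear estimator built from a Chebyshev polynomial more concrete, here is a minimal Python sketch. It is not the authors' code. It assumes the estimator has the form "sum of g(N_i) over symbols", where N_i is the number of times symbol i was seen in n samples, the minimum non-zero mass is at least 1/k, g(0) = 0, g(j) = 1 for j > L, and for j <= L the weight is a_j * j! / n^j + 1, with a_j the power-basis coefficients of a degree-L Chebyshev polynomial shifted to the interval [l, r] and normalized to equal -1 at zero. The function names, the interval endpoints l = 1/k and r = c1 * log(k) / n, the degree L = floor(c0 * log(k)), and the default values of c0 and c1 are all illustrative assumptions, not values confirmed by the text above.

```python
import math


def poly_mul_linear(p, b, a):
    """Multiply polynomial p (coefficients low to high) by (b + a*x)."""
    out = [0.0] * (len(p) + 1)
    for i, c in enumerate(p):
        out[i] += b * c
        out[i + 1] += a * c
    return out


def chebyshev_weights(n, k, c0=0.5, c1=0.5):
    """Weights g(0..L) of a Chebyshev-based linear estimator (sketch).

    c0 and c1 are hypothetical tuning constants chosen for illustration.
    """
    L = int(math.floor(c0 * math.log(k)))
    l, r = 1.0 / k, c1 * math.log(k) / n
    if L < 1 or r <= l:
        raise ValueError("need L >= 1 and r > l; adjust n, k, c0 or c1")
    # Affine map y = beta + alpha * x sends [l, r] onto [-1, 1].
    alpha, beta = 2.0 / (r - l), -(r + l) / (r - l)
    # Chebyshev recurrence T_{m+1}(y) = 2*y*T_m(y) - T_{m-1}(y),
    # carried out directly on power-basis coefficients in x.
    prev, cur = [1.0], [beta, alpha]
    for _ in range(L - 1):
        nxt = poly_mul_linear(cur, 2.0 * beta, 2.0 * alpha)
        for i, c in enumerate(prev):
            nxt[i] -= c
        prev, cur = cur, nxt
    # cur[0] is T_L(beta), the value at x = 0; normalize so that P(0) = -1.
    a = [-c / cur[0] for c in cur]
    # g(j) = a_j * j! / n^j + 1, which gives g(0) = 0.
    return [a[j] * math.factorial(j) / n**j + 1.0 for j in range(L + 1)]


def support_size_estimate(counts, n, k, c0=0.5, c1=0.5):
    """Sum the weights over observed symbols; counts above L contribute 1."""
    g = chebyshev_weights(n, k, c0, c1)
    L = len(g) - 1
    return sum(g[c] if c <= L else 1.0 for c in counts if c > 0)
```

For example, support_size_estimate(counts, n=500, k=1000) takes a list of per-symbol counts. Symbols seen more than L times are simply counted once, while rarely seen symbols receive the polynomial-based weights, which is where the correction for unseen symbols comes from. For large L the power-basis coefficients grow quickly, so a careful implementation would likely need more attention to numerical stability than this sketch gives.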
Videos
Chebyshev Polynomials, Moment Matching and Optimal Estimation of the Unseen · YouTube
Taxonomy
Topics: Bayesian Methods and Mixture Models · Markov Chains and Monte Carlo Methods · Statistical Methods and Inference
