Concentration in the Generalized Chinese Restaurant Process
Alan Pereira, Roberto I. Oliveira, Rodrigo Ribeiro

TL;DR
This paper proves non-asymptotic concentration bounds for the number of parts of a given size in the Generalized Chinese Restaurant Process (GCRP), in the regime where the number of parts grows polynomially in the sample size n.
Contribution
We establish finite-sample concentration results for the number of parts of size k in the GCRP, complementing existing asymptotic and deviation analyses.
Findings
Number of parts of size k concentrates around c_k V_* n^α
Finite-n bounds for the total number of parts are derived
Results complement asymptotic statements by Pitman and more recent deviation theorems
Abstract
The Generalized Chinese Restaurant Process (GCRP) describes a sequence of exchangeable random partitions of the numbers {1, ..., n}. This process is related to the Ewens sampling model in Genetics and to Bayesian nonparametric methods such as topic models. In this paper, we study the GCRP in a regime where the number of parts grows like n^α. We prove a non-asymptotic concentration result for the number of parts of size k. In particular, we show that these random variables concentrate around c_k V_* n^α, where V_* n^α is the asymptotic number of parts and c_k is a positive value depending on k. We also obtain finite-n bounds for the total number of parts. Our theorems complement asymptotic statements by Pitman and more recent results on large and moderate…
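To make the quantities in the abstract concrete, here is a minimal simulation sketch. It is not taken from the paper. It assumes the GCRP is the standard two-parameter seating scheme with discount alpha in (0, 1) and concentration theta > -alpha: customer i+1 opens a new part with probability (theta + alpha * K) / (theta + i), where K is the current number of parts, and otherwise joins an existing part of size s with probability proportional to s - alpha. The function name `simulate_gcrp` and its parameters are hypothetical.

```python
import random
from collections import Counter


def simulate_gcrp(n, alpha, theta, seed=None):
    """Seat n customers one at a time and return the list of part sizes.

    Assumed seating rule (two-parameter scheme, not confirmed by the source):
      new part:       probability (theta + alpha * K) / (theta + i)
      existing part:  probability (size - alpha) / (theta + i)
    where i customers are already seated and K parts exist.
    """
    rng = random.Random(seed)
    sizes = []
    for i in range(n):
        k = len(sizes)
        # The first customer always opens a part; later ones follow the rule.
        if k == 0 or rng.random() * (theta + i) < theta + alpha * k:
            sizes.append(1)
        else:
            # Choose an existing part with weight (size - alpha).
            weights = [s - alpha for s in sizes]
            j = rng.choices(range(k), weights=weights)[0]
            sizes[j] += 1
    return sizes


# Example: total number of parts and the number of parts of size 1 and 2.
sizes = simulate_gcrp(5000, alpha=0.5, theta=1.0, seed=0)
parts_by_size = Counter(sizes)
print(len(sizes), parts_by_size[1], parts_by_size[2])
```

Running this for increasing n and several seeds gives an informal view of how tightly the count of parts of size k clusters across runs, which is the behavior the concentration bounds quantify. The simulation says nothing about the constants c_k or V_*.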
Taxonomy
Topics: Bayesian Methods and Mixture Models · Stochastic processes and statistical mechanics · Algorithms and Data Compression
