The Exact Asymptotic Form of Bayesian Generalization Error in Latent Dirichlet Allocation
Naoki Hayashi

TL;DR
This paper derives the exact asymptotic form of Bayesian generalization error in Latent Dirichlet Allocation using algebraic geometry, revealing its relation to matrix factorization and the impact of parameter constraints.
Contribution
It provides the first theoretical analysis of LDA's generalization error, connecting it to matrix factorization and accounting for parameter space restrictions.
Findings
Exact asymptotic form of generalization error derived
Error expressed in terms of matrix factorization
Numerical experiments confirm theoretical predictions
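For orientation, the "asymptotic form" in these findings can be read against the general shape that singular learning theory gives for Bayesian generalization error and free energy. The symbols below follow the usual conventions of that theory and are assumptions here, since they do not appear in this summary: n is the sample size, G_n the generalization error, F_n the negative log marginal likelihood, S_n the empirical entropy, λ the learning coefficient (real log canonical threshold), and m its multiplicity. The paper's specific value of λ for LDA is not reproduced here.

```latex
\mathbb{E}[G_n] = \frac{\lambda}{n} - \frac{m-1}{n \log n} + o\!\left(\frac{1}{n \log n}\right),
\qquad
F_n = n S_n + \lambda \log n - (m-1) \log\log n + O_p(1)
```

In a regular model with d parameters, λ = d/2 and m = 1, which recovers the familiar d/(2n) rate; in a singular model such as LDA, λ has to be computed case by case.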
Abstract
Latent Dirichlet allocation (LDA) obtains essential information from data by using Bayesian inference. It is applied to knowledge discovery via dimensionality reduction and clustering in many fields. However, its generalization error had not yet been clarified, because it is a singular statistical model: the map from parameters to probability distributions is not one-to-one. In this paper, we give the exact asymptotic form of its generalization error and marginal likelihood, by theoretical analysis of its learning coefficient using algebraic geometry. The theoretical result shows that the Bayesian generalization error in LDA is expressed in terms of that in matrix factorization and a penalty from the simplex restriction of LDA's parameter region. A numerical experiment is consistent with the theoretical result.
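The link to matrix factorization and the absence of a one-to-one parameter map can be illustrated with a minimal sketch. It assumes the common view of LDA's document-word probabilities as a product of two column-stochastic matrices. The sizes, the variable names (`A`, `B`, `P`), and the use of NumPy are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for illustration.
n_words, n_topics, n_docs = 6, 3, 4

# Each column of A is one topic's distribution over words (a point on a simplex).
A = rng.dirichlet(np.ones(n_words), size=n_topics).T   # shape (n_words, n_topics)
# Each column of B is one document's mixture over topics (also on a simplex).
B = rng.dirichlet(np.ones(n_topics), size=n_docs).T    # shape (n_topics, n_docs)

# Column j of P is the word distribution of document j.
P = A @ B

# Relabeling the topics changes the parameters (A, B) but not the product,
# so the map from parameters to distributions is not one-to-one.
perm = [2, 0, 1]
A2, B2 = A[:, perm], B[perm, :]
P2 = A2 @ B2

print("columns of P sum to 1:", np.allclose(P.sum(axis=0), 1.0))
print("same distribution after relabeling:", np.allclose(P, P2))
print("same parameters after relabeling:", np.allclose(A, A2))
```

Topic relabeling is only the simplest source of non-identifiability. The simplex restriction mentioned in the abstract corresponds here to every column of `A` and `B` being non-negative and summing to one.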
Taxonomy
Topics: Bayesian Methods and Mixture Models · Bayesian Modeling and Causal Inference · Neural Networks and Applications
Methods: Latent Dirichlet Allocation
