Asymptotic Model Selection for Naive Bayesian Networks
Dmitry Rusakov, Dan Geiger

TL;DR
This paper derives a closed-form asymptotic formula for the marginal likelihood of data given a naive Bayesian network with two hidden states and binary features, and shows that the standard BIC score does not give a valid approximation for this class of models.
Contribution
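It provides a closed-form asymptotic formula for the marginal likelihood in naive Bayesian networks with two hidden states and binary features. The formula serves as a concrete example that the BIC score is generally not valid for models that belong to a stratified exponential family.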
Findings
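The BIC score is generally not valid for statistical models that belong to a stratified exponential family, in contrast to linear and curved exponential families, where it has been proven correct.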
The derived formula differs from the standard BIC score.
Naive Bayesian networks with hidden states require specialized asymptotic analysis.
Abstract
We develop a closed form asymptotic formula to compute the marginal likelihood of data given a naive Bayesian network model with two hidden states and binary features. This formula deviates from the standard BIC score. Our work provides a concrete example that the BIC score is generally not valid for statistical models that belong to a stratified exponential family. This stands in contrast to linear and curved exponential families, where the BIC score has been proven to provide a correct approximation for the marginal likelihood.
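For reference, the standard BIC approximation that the abstract contrasts against is usually written as follows. The notation is an assumption chosen here, not taken from the paper: D is the data, M the model, N the sample size, \hat{\theta} the maximum-likelihood parameters, and d the number of free parameters. For a naive Bayesian network with a binary hidden class and n binary features, the usual parameter count is d = 2n + 1 (one class prior plus two conditional probabilities per feature). The paper's own corrected formula is not reproduced here.

```latex
\ln P(D \mid M) \;\approx\; \ln P(D \mid \hat{\theta}, M) \;-\; \frac{d}{2}\,\ln N,
\qquad d = 2n + 1 .
```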
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Bayesian Methods and Mixture Models · Statistical Methods and Bayesian Inference
