Factoring Multidimensional Data to Create a Sophisticated Bayes Classifier
Anthony LaTorre

TL;DR
This paper introduces a method to compute the marginal likelihood of data factorizations, enabling the selection of optimal variable groupings for constructing more effective Bayesian classifiers.
Contribution
It derives an explicit formula for the marginal likelihood of data factorizations, facilitating the identification of the best variable partition for Bayesian classification.
Findings
Explicit formula for marginal likelihood of factorizations
Method to rank factorizations based on likelihoods
Improved Bayesian classifier construction
Abstract
In this paper we derive an explicit formula for calculating the marginal likelihood of a given factorization of a categorical dataset. Since the marginal likelihood is proportional to the posterior probability of the factorization, these likelihoods can be used to order all possible factorizations and select the "best" way to factor the overall distribution from which the dataset is drawn. The best factorization can then be used to construct a Bayes classifier which benefits from factoring out mutually independent sets of variables.
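The abstract does not reproduce the formula itself, so the following is only a minimal sketch of the idea under assumptions that are not taken from the paper. It treats a factorization as a partition of the variable indices into blocks, gives each block's joint cell probabilities a uniform Dirichlet prior, and takes the prior over factorizations to be uniform, so that ranking by marginal likelihood matches ranking by posterior probability. The function names (`log_marginal_likelihood`, `partitions`, `rank_factorizations`) and the toy dataset are hypothetical.

```python
from collections import Counter
from math import lgamma, prod


def log_marginal_likelihood(rows, blocks, cardinalities):
    """Log marginal likelihood of `rows` under one factorization.

    Assumption (not from the paper): each block of variables has an
    independent multinomial over its joint cells with a uniform
    Dirichlet(1, ..., 1) prior, giving the Dirichlet-multinomial form
    Gamma(K) / Gamma(N + K) * prod_k Gamma(n_k + 1) per block.
    """
    n = len(rows)
    total = 0.0
    for block in blocks:
        # K = number of joint cells for this block of variables.
        k = prod(cardinalities[i] for i in block)
        counts = Counter(tuple(row[i] for i in block) for row in rows)
        total += lgamma(k) - lgamma(n + k)
        # Empty cells contribute lgamma(1) = 0, so only observed cells matter.
        total += sum(lgamma(c + 1) for c in counts.values())
    return total


def partitions(items):
    """Yield every set partition of `items` as a list of blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        # Put `first` into each existing block in turn...
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]
        # ...or into a new block of its own.
        yield [[first]] + part


def rank_factorizations(rows, cardinalities):
    """Return (score, factorization) pairs, best score first."""
    variables = list(range(len(cardinalities)))
    scored = [
        (log_marginal_likelihood(rows, p, cardinalities), p)
        for p in partitions(variables)
    ]
    return sorted(scored, key=lambda s: s[0], reverse=True)


if __name__ == "__main__":
    # Toy data: variables 0 and 1 always agree, variable 2 is independent.
    data = [(0, 0, 0), (0, 0, 1), (1, 1, 0), (1, 1, 1)] * 10
    for score, blocks in rank_factorizations(data, [2, 2, 2]):
        print(f"{score:10.3f}  {blocks}")
```

Enumerating every partition is feasible only for a handful of variables, since the number of partitions grows as the Bell numbers. A Bayes classifier built on the winning factorization would then estimate one class-conditional table per block instead of one table over all variables.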
Taxonomy
Topics: Bayesian Modeling and Causal Inference · Data Mining Algorithms and Applications · Data Management and Algorithms
