Factoring Multidimensional Data to Create a Sophisticated Bayes Classifier

Anthony LaTorre

arXiv:2105.05181·cs.LG·May 19, 2021

TL;DR

This paper derives a formula for the marginal likelihood of a factorization of a categorical dataset, so that candidate groupings of variables can be ranked and the best grouping used to build a Bayes classifier.

Contribution

It derives an explicit formula for the marginal likelihood of a given factorization of a categorical dataset, which makes it possible to identify the best partition of the variables for a Bayes classifier.

Findings

01. Explicit formula for marginal likelihood of factorizations

02. Method to rank factorizations based on likelihoods

03. Improved Bayesian classifier construction

Abstract

In this paper we derive an explicit formula for calculating the marginal likelihood of a given factorization of a categorical dataset. Since the marginal likelihood is proportional to the posterior probability of the factorization, these likelihoods can be used to order all possible factorizations and select the "best" way to factor the overall distribution from which the dataset is drawn. The best factorization can then be used to construct a Bayes classifier which benefits from factoring out mutually independent sets of variables.
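The ranking idea in the abstract can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the paper's own derivation: it assumes a symmetric Dirichlet prior with concentration `alpha = 1` (a uniform prior) on the cell probabilities of each block, and a uniform prior over factorizations, so that ordering by marginal likelihood matches ordering by posterior probability. The function names (`log_block_evidence`, `set_partitions`, `rank_factorizations`) and the toy dataset are hypothetical.

```python
from collections import Counter
from math import lgamma


def log_block_evidence(rows, block, cardinalities, alpha=1.0):
    """Log marginal likelihood of the joint distribution over the columns in `block`.

    Assumption: symmetric Dirichlet(alpha) prior over the k joint cells of the block,
    which gives the standard Dirichlet-multinomial evidence for an observed sequence.
    """
    k = 1
    for col in block:
        k *= cardinalities[col]  # number of joint categories in this block
    counts = Counter(tuple(row[c] for c in block) for row in rows)
    n = len(rows)
    out = lgamma(k * alpha) - lgamma(n + k * alpha)
    # Unobserved cells contribute lgamma(alpha) - lgamma(alpha) = 0, so skip them.
    for c in counts.values():
        out += lgamma(c + alpha) - lgamma(alpha)
    return out


def set_partitions(items):
    """Yield every partition of `items` into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in set_partitions(rest):
        yield [[first]] + part  # `first` starts its own block
        for i in range(len(part)):  # or `first` joins an existing block
            yield part[:i] + [[first] + part[i]] + part[i + 1:]


def rank_factorizations(rows, cardinalities, alpha=1.0):
    """Score each factorization as the sum of its block log evidences, best first."""
    cols = list(range(len(cardinalities)))
    scored = []
    for part in set_partitions(cols):
        score = sum(log_block_evidence(rows, b, cardinalities, alpha) for b in part)
        scored.append((score, part))
    scored.sort(key=lambda t: t[0], reverse=True)
    return scored


# Toy data (hypothetical): column 1 copies column 0, column 2 varies independently.
rows = [(i % 2, i % 2, (i // 2) % 2) for i in range(200)]
ranked = rank_factorizations(rows, cardinalities=[2, 2, 2])
best_score, best_partition = ranked[0]
print(best_partition, best_score)
```

Enumerating every set partition grows with the Bell numbers, so this brute-force loop is only practical for a handful of variables. A classifier built on the winning factorization would then estimate one joint table per block and per class, rather than one table per variable as in naive Bayes.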

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Data Mining Algorithms and Applications · Data Management and Algorithms