Naive Bayes Classifiers and One-hot Encoding of Categorical Variables

Christopher K. I. Williams

arXiv:2404.18190 · cs.LG · April 30, 2024 · 1 citation

Open Access

TL;DR

This paper examines what happens when a categorical variable is incorrectly one-hot encoded for a Naive Bayes classifier, comparing the resulting product-of-Bernoullis (PoB) model with the correct categorical model both mathematically and experimentally.

Contribution

It provides a mathematical and experimental comparison between the product-of-Bernoullis classifier that results from one-hot encoding a categorical variable and the correct categorical Naive Bayes classifier.
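To make the comparison concrete, the two likelihoods can be written side by side. This is a sketch in notation chosen here, not taken from the paper: $\pi_c$ is the prior of class $c$, $\theta_{ck}$ is the probability of category $k$ under class $c$, and $\mathbf{e}_k$ is the one-hot vector for category $k$. It assumes that each bit's Bernoulli parameter equals the corresponding category probability $\theta_{ck}$.

```latex
% Correct categorical model: one K-valued variable, with \sum_{k=1}^{K} \theta_{ck} = 1
p(x = k \mid c) = \theta_{ck},
\qquad
p(c \mid x = k) \propto \pi_c \, \theta_{ck}

% Product-of-Bernoullis (PoB) model: the K bits of \mathbf{e}_k treated as independent
p_{\mathrm{PoB}}(\mathbf{e}_k \mid c) = \theta_{ck} \prod_{j \neq k} (1 - \theta_{cj}),
\qquad
p_{\mathrm{PoB}}(c \mid \mathbf{e}_k) \propto \pi_c \, \theta_{ck} \prod_{j \neq k} (1 - \theta_{cj})
```

Under these assumptions the only difference is the extra factor $\prod_{j \neq k} (1 - \theta_{cj})$ contributed by the "off" bits. Because that factor depends on the class $c$, it does not cancel when the posterior is normalised.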

Findings

1. The two classifiers agree on the MAP class label in most cases, despite the encoding difference.
2. Posterior probabilities are usually greater under the PoB assumption.
3. The differences between the two classifiers are analyzed both mathematically and experimentally.

Abstract

This paper investigates the consequences of encoding a $K$-valued categorical variable incorrectly as $K$ bits via one-hot encoding, when using a Naïve Bayes classifier. This gives rise to a product-of-Bernoullis (PoB) assumption, rather than the correct categorical Naïve Bayes classifier. The differences between the two classifiers are analysed mathematically and experimentally. In our experiments using probability vectors drawn from a Dirichlet distribution, the two classifiers are found to agree on the maximum a posteriori class label for most cases, although the posterior probabilities are usually greater for the PoB case.
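The comparison described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the paper's experimental setup: the function names, the single-feature setting, the uniform class prior, the flat Dirichlet parameters, and the values of `C`, `K`, and `trials` are all assumptions made here for illustration. It also assumes each bit's Bernoulli parameter equals the matching category probability.

```python
import numpy as np

def categorical_posterior(prior, theta, k):
    """Class posterior for one categorical feature taking value k.

    prior: shape (C,) class priors; theta: shape (C, K), rows sum to 1.
    """
    unnorm = prior * theta[:, k]
    return unnorm / unnorm.sum()

def pob_posterior(prior, theta, k):
    """Class posterior when the K one-hot bits are treated as independent Bernoullis."""
    bits = np.zeros(theta.shape[1], dtype=bool)
    bits[k] = True
    # "On" bit contributes theta_ck; each "off" bit contributes (1 - theta_cj).
    lik = np.prod(np.where(bits, theta, 1.0 - theta), axis=1)
    unnorm = prior * lik
    return unnorm / unnorm.sum()

# Hypothetical experiment settings (not the paper's): 2 classes, 5 categories.
rng = np.random.default_rng(0)
C, K, trials = 2, 5, 1000
prior = np.full(C, 1.0 / C)

agree = 0
pob_more_confident = 0
for _ in range(trials):
    theta = rng.dirichlet(np.ones(K), size=C)  # one probability vector per class
    k = rng.integers(K)                        # observed category
    cat = categorical_posterior(prior, theta, k)
    pob = pob_posterior(prior, theta, k)
    agree += int(cat.argmax() == pob.argmax())
    pob_more_confident += int(pob.max() > cat.max())

print(f"MAP agreement: {agree / trials:.3f}")
print(f"PoB max posterior larger: {pob_more_confident / trials:.3f}")
```

The printed fractions depend on the assumed settings, so they should not be read as reproducing the paper's reported results.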

Taxonomy

Topics: Face and Expression Recognition · Data Mining Algorithms and Applications · Fuzzy Logic and Control Systems