Estimating Continuous Distributions in Bayesian Classifiers

George H. John; Pat Langley

arXiv:1302.4964·cs.LG·February 21, 2013·2.9k cites

TL;DR

This paper replaces the usual single-Gaussian assumption for continuous variables in a naive Bayesian classifier with nonparametric kernel density estimation, and reports large reductions in error on several natural and artificial data sets.

Contribution

The paper applies kernel density estimation to the conditional distributions of continuous variables in a naive Bayesian classifier, abandoning the normality assumption so that non-Gaussian distributions can be modeled.

Findings

01. Kernel density estimation reduces classification error.
02. Nonparametric methods outperform Gaussian assumptions.
03. Improved accuracy on natural and artificial datasets.

Abstract

When modeling a probability distribution with a Bayesian network, we are faced with the problem of how to handle continuous variables. Most previous work has either solved the problem by discretizing, or assumed that the data are generated by a single Gaussian. In this paper we abandon the normality assumption and instead use statistical methods for nonparametric density estimation. For a naive Bayesian classifier, we present experimental results on a variety of natural and artificial domains, comparing two methods of density estimation: assuming normality and modeling each conditional distribution with a single Gaussian; and using nonparametric kernel density estimation. We observe large reductions in error on several natural and artificial data sets, which suggests that kernel estimation is a useful tool for learning Bayesian models.
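
The two density estimators compared in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy data, and the bandwidth heuristic `h = 1 / sqrt(n)` are all assumptions chosen for illustration, since the abstract does not specify them. The sketch fits a naive Bayesian classifier on one continuous attribute whose per-class distribution is bimodal, then classifies the same point with each estimator.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def single_gaussian_density(x, samples):
    """Normality assumption: fit one Gaussian to the class's samples."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((s - mean) ** 2 for s in samples) / max(n - 1, 1)
    std = max(math.sqrt(var), 1e-6)  # guard against zero variance
    return gaussian_pdf(x, mean, std)


def kernel_density(x, samples, bandwidth=None):
    """Kernel estimate: average of Gaussian kernels, one per sample.

    The default bandwidth 1/sqrt(n) is an illustrative assumption.
    """
    n = len(samples)
    h = bandwidth if bandwidth is not None else 1.0 / math.sqrt(n)
    return sum(gaussian_pdf(x, s, h) for s in samples) / n


def fit(X, y):
    """Group attribute values by class: {label: [values of attr 0, attr 1, ...]}."""
    model = {}
    for row, label in zip(X, y):
        cols = model.setdefault(label, [[] for _ in row])
        for j, value in enumerate(row):
            cols[j].append(value)
    return model


def predict(model, row, density):
    """Naive Bayes decision: argmax of log prior + sum of log attribute densities."""
    total = sum(len(cols[0]) for cols in model.values())
    best_label, best_score = None, -math.inf
    for label, cols in model.items():
        score = math.log(len(cols[0]) / total)
        for j, value in enumerate(row):
            score += math.log(density(value, cols[j]) + 1e-300)  # avoid log(0)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical toy data: each class is bimodal, with interleaved modes.
offsets = [-0.4, -0.2, 0.0, 0.2, 0.4]
class_a = [-3.0 + o for o in offsets] + [1.0 + o for o in offsets]
class_b = [-1.0 + o for o in offsets] + [3.0 + o for o in offsets]
X = [[v] for v in class_a + class_b]
y = ["a"] * len(class_a) + ["b"] * len(class_b)
model = fit(X, y)

print(predict(model, [1.0], single_gaussian_density))
print(predict(model, [1.0], kernel_density))
```

On this toy data the point `1.0` sits on one of class `a`'s modes. The single Gaussian fitted to class `a` is centered at -1 and the one fitted to class `b` at +1, so the Gaussian version labels the point `b`, while the kernel version labels it `a`. The example is constructed to favor kernel estimation and says nothing about how large the difference is on real data.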

Taxonomy

Topics: Bayesian Modeling and Causal Inference · Bayesian Methods and Mixture Models · Gaussian Processes and Bayesian Inference