Improving spam filtering by combining Naive Bayes with simple k-nearest neighbor searches

Daniel Etzold

arXiv:cs/0312004·cs.LG·May 23, 2007·3 cites

Open Access

TL;DR

This paper reports empirical results for email spam filtering with a combination of naive Bayes and k-nearest neighbor searches. The combination improves the accuracy of a Bayes filter slightly when many features are used and significantly when few features are used.

Contribution

It introduces a novel hybrid approach that enhances naive Bayes spam filtering by integrating simple k-nearest neighbor searches, showing improved accuracy with fewer features.

Findings

01  Improved accuracy with fewer features using the hybrid method

02  Slight accuracy gains for high feature counts

03  Significant accuracy improvements for low feature counts

Abstract

Using naive Bayes for email classification has become very popular within the last few months. Naive Bayes filters are quite easy to implement and very efficient. In this paper we present empirical results of email classification using a combination of naive Bayes and k-nearest neighbor searches. Using this technique, we show that the accuracy of a Bayes filter can be improved slightly for a high number of features and significantly for a small number of features.
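The abstract does not say how the two classifiers are combined, so the following is only an illustrative sketch of one plausible scheme, not the paper's method: a simplified naive Bayes score decides confident cases, and a k-nearest neighbor vote handles messages whose score falls near 0.5. The whitespace tokenizer, the Jaccard similarity, the toy training set, and the `margin` and `k` parameters are all assumptions made for the example.

```python
import math
from collections import Counter

def tokenize(text):
    # Assumption: lowercase whitespace tokens stand in for real feature extraction.
    return text.lower().split()

class NaiveBayes:
    """Simplified naive Bayes scorer; assumes both classes occur in training."""

    def __init__(self, docs, labels):
        self.doc_counts = Counter(labels)
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        for doc, label in zip(docs, labels):
            # Count each word once per message (presence, not frequency).
            self.word_counts[label].update(set(tokenize(doc)))
        self.vocab = set(self.word_counts["spam"]) | set(self.word_counts["ham"])

    def spam_probability(self, text):
        # Only words present in the message and seen in training contribute.
        log_odds = math.log(self.doc_counts["spam"] / self.doc_counts["ham"])
        for word in set(tokenize(text)) & self.vocab:
            # Add-one smoothing keeps both likelihoods strictly positive.
            p_spam = (self.word_counts["spam"][word] + 1) / (self.doc_counts["spam"] + 2)
            p_ham = (self.word_counts["ham"][word] + 1) / (self.doc_counts["ham"] + 2)
            log_odds += math.log(p_spam / p_ham)
        return 1 / (1 + math.exp(-log_odds))

def knn_label(text, docs, labels, k=3):
    # Majority vote among the k training messages most similar to the query.
    query = set(tokenize(text))

    def jaccard(doc):
        other = set(tokenize(doc))
        union = query | other
        return len(query & other) / len(union) if union else 0.0

    ranked = sorted(zip(docs, labels), key=lambda pair: jaccard(pair[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

def classify(text, nb, docs, labels, k=3, margin=0.4):
    # Hypothetical combination rule: trust naive Bayes when it is confident,
    # otherwise fall back to the k-nearest neighbor vote.
    p = nb.spam_probability(text)
    if abs(p - 0.5) >= margin:
        return "spam" if p > 0.5 else "ham"
    return knn_label(text, docs, labels, k)

# Toy training data, invented for illustration only.
docs = [
    "cheap pills buy now",
    "win money now",
    "cheap money offer",
    "meeting agenda attached",
    "lunch tomorrow at noon",
    "project status report attached",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]
nb = NaiveBayes(docs, labels)

print(classify("buy cheap pills", nb, docs, labels))
print(classify("meeting tomorrow", nb, docs, labels))
```

In this sketch the second message shares only two words with the training set, so its naive Bayes score stays inside the margin and the neighbor vote decides. That is the kind of low-feature situation in which the abstract reports the larger gains, though the paper's actual mechanism may differ.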

Taxonomy

Topics: Spam and Phishing Detection · Text and Document Classification Technologies · Data Management and Algorithms