# Is a Data-Driven Approach still Better than Random Choice with Naive Bayes classifiers?

**Authors:** Piotr Szymański, Tomasz Kajdanowicz

arXiv: 1702.04013 · 2017-02-15

## TL;DR

This study compares data-driven, a priori, and random label space partitioning methods for multi-label classification using Gaussian Naive Bayes, showing data-driven methods generally outperform random approaches on benchmark datasets.

## Contribution

It provides an empirical comparison of label partitioning strategies for Naive Bayes classifiers, highlighting the conditions under which data-driven methods outperform others.
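To make "label space partitioning" concrete, here is a minimal sketch of a random partition of the label set into disjoint subsets. The function names `random_label_partition` and `project_labels`, the fixed `subset_size` parameter, and the toy sizes are illustrative assumptions, not the authors' implementation.

```python
import random


def random_label_partition(n_labels, subset_size, seed=None):
    """Split label indices 0..n_labels-1 into disjoint random subsets.

    Hypothetical helper: a generic random partition, not the paper's code.
    """
    if n_labels <= 0 or subset_size <= 0:
        raise ValueError("n_labels and subset_size must be positive")
    rng = random.Random(seed)
    labels = list(range(n_labels))
    rng.shuffle(labels)
    # Consecutive chunks of the shuffled list form the partition.
    # The last chunk is smaller when subset_size does not divide n_labels.
    return [
        sorted(labels[i:i + subset_size])
        for i in range(0, n_labels, subset_size)
    ]


def project_labels(Y, subset):
    """Restrict each binary label row of Y to the columns listed in subset."""
    return [tuple(row[j] for j in subset) for row in Y]


# Toy usage: 7 labels split into subsets of at most 3 labels each.
partition = random_label_partition(n_labels=7, subset_size=3, seed=0)
Y = [[1, 0, 1, 0, 0, 1, 0],
     [0, 1, 1, 0, 1, 0, 0]]
sub_targets = [project_labels(Y, subset) for subset in partition]
```

In a partitioning scheme of this kind, each subset would typically get its own classifier (Gaussian Naive Bayes in the setting studied here), and the per-subset predictions would be merged back into one label vector. A data-driven method would replace the shuffle with a grouping derived from the training labels.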

## Key findings

- Data-driven methods are significantly better than an average run of the random baseline.
- For F1 scores and Subset Accuracy, data-driven approaches are more likely than not to beat random approaches in the worst case.
- There is always some method that performs better than the a priori methods in the worst case.

## Abstract

We study the performance of data-driven, a priori and random approaches to label space partitioning for multi-label classification with a Gaussian Naive Bayes classifier. Experiments were performed on 12 benchmark data sets and evaluated on 5 established measures of classification quality: micro and macro averaged F1 score, Subset Accuracy and Hamming loss. Data-driven methods are significantly better than an average run of the random baseline. In the case of F1 scores and Subset Accuracy, data-driven approaches were more likely to perform better than random approaches than otherwise in the worst case. There always exists a method that performs better than a priori methods in the worst case. The advantage of data-driven methods over a priori methods is smaller with a weak classifier than when tree classifiers are used.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1702.04013/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/1702.04013/full.md

## References

14 references — full list in the complete paper: https://tomesphere.com/paper/1702.04013/full.md

---
Source: https://tomesphere.com/paper/1702.04013