Universal Training of Neural Networks to Achieve Bayes Optimal Classification Accuracy

Mohammadreza Tavasoli Naeini; Ali Bereyhi; Morteza Noshad; Ben Liang; Alfred O. Hero III

arXiv:2501.07754·cs.LG·January 15, 2025

Open Access

TL;DR

This paper introduces BOLT, a loss function derived from an $f$-divergence upper bound on the Bayes error rate. Minimizing BOLT drives a classifier toward the Bayes error rate, and on MNIST, Fashion-MNIST, CIFAR-10, and IMDb the resulting models match or exceed those trained with cross-entropy.

Contribution

The paper derives a new $f$-divergence-based upper bound on the Bayes error rate, shows that the bound can be computed by sampling from the output of a parameterized model, and uses it to define the BOLT loss for training classifiers.

Findings

01

BOLT achieves comparable or better performance than cross-entropy.

02

Models trained with BOLT show improved generalization on challenging datasets.

03

The method is validated on image and text classification tasks.

Abstract

This work invokes the notion of $f$-divergence to introduce a novel upper bound on the Bayes error rate of a general classification task. We show that the proposed bound can be computed by sampling from the output of a parameterized model. Using this practical interpretation, we introduce the Bayes optimal learning threshold (BOLT) loss whose minimization enforces a classification model to achieve the Bayes error rate. We validate the proposed loss for image and text classification tasks, considering MNIST, Fashion-MNIST, CIFAR-10, and IMDb datasets. Numerical experiments demonstrate that models trained with BOLT achieve performance on par with or exceeding that of cross-entropy, particularly on challenging datasets. This highlights the potential of BOLT in improving generalization.
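For readers unfamiliar with the quantity the abstract targets, the sketch below illustrates the Bayes error rate itself, not the BOLT loss or the paper's bound. It assumes a toy problem that does not appear in the paper: two equally likely classes with one-dimensional Gaussian features, means -1 and +1, unit variance. Because the true posterior is known here, the Bayes error $\mathbb{E}[1 - \max_k P(k \mid x)]$ can be estimated by sampling and compared with its closed form. All function names are hypothetical.

```python
import math
import random


def gaussian_pdf(x, mean, std=1.0):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posterior_class1(x, mean0=-1.0, mean1=1.0):
    """True posterior P(class 1 | x) for the assumed toy problem (equal priors)."""
    p0 = 0.5 * gaussian_pdf(x, mean0)
    p1 = 0.5 * gaussian_pdf(x, mean1)
    return p1 / (p0 + p1)


def bayes_error_monte_carlo(n_samples=200_000, seed=0):
    """Estimate E[1 - max_k P(k | x)] by sampling (label, x) pairs."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        label = rng.randrange(2)                       # equal class priors
        x = rng.gauss(1.0 if label == 1 else -1.0, 1.0)
        p1 = posterior_class1(x)
        total += 1.0 - max(p1, 1.0 - p1)               # error of the optimal rule at x
    return total / n_samples


def bayes_error_closed_form(mean0=-1.0, mean1=1.0, std=1.0):
    """Closed form for two equal-prior, equal-variance Gaussians: Phi(-d / 2)."""
    d = abs(mean1 - mean0) / std
    return 0.5 * math.erfc(d / (2.0 * math.sqrt(2.0)))


if __name__ == "__main__":
    print("closed form :", round(bayes_error_closed_form(), 4))   # 0.1587
    print("monte carlo :", round(bayes_error_monte_carlo(), 4))
```

No classifier can do better than this error on the toy problem, whatever its architecture. In real tasks the true posterior is unknown, which is why a bound that can be computed from a model's outputs, as the abstract describes, is useful.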

Taxonomy

Topics: Neural Networks and Applications