Universal Training of Neural Networks to Achieve Bayes Optimal Classification Accuracy
Mohammadreza Tavasoli Naeini, Ali Bereyhi, Morteza Noshad, Ben Liang, Alfred O. Hero III

TL;DR
This paper introduces BOLT, a loss function derived from an $f$-divergence upper bound on the Bayes error rate. Minimizing it pushes a neural network toward Bayes optimal classification accuracy, and models trained with it match or exceed cross-entropy on standard image and text datasets.
Contribution
The paper proposes the BOLT loss, derived from a new $f$-divergence upper bound on the Bayes error rate that can be computed by sampling from a model's output, so that minimizing the loss drives the model toward Bayes optimality.
Findings
BOLT achieves performance on par with or better than cross-entropy.
The gains appear mainly on the more challenging datasets, which suggests a potential for improved generalization.
The method is validated on image and text classification tasks (MNIST, Fashion-MNIST, CIFAR-10, and IMDb).
Abstract
This work invokes the notion of $f$-divergence to introduce a novel upper bound on the Bayes error rate of a general classification task. We show that the proposed bound can be computed by sampling from the output of a parameterized model. Using this practical interpretation, we introduce the Bayes optimal learning threshold (BOLT) loss whose minimization drives a classification model to achieve the Bayes error rate. We validate the proposed loss for image and text classification tasks, considering MNIST, Fashion-MNIST, CIFAR-10, and IMDb datasets. Numerical experiments demonstrate that models trained with BOLT achieve performance on par with or exceeding that of cross-entropy, particularly on challenging datasets. This highlights the potential of BOLT in improving generalization.
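To make the target quantity concrete, here is a minimal sketch of what the Bayes error rate is on a toy discrete problem. It is not the BOLT loss or the paper's bound, neither of which is given in this summary. The joint distribution, the function names `bayes_error` and `classifier_error`, and the example decision rules are all hypothetical and chosen only for illustration. The sketch relies on the standard fact that the Bayes error equals $1 - \sum_x \max_y P(x, y)$ and that no classifier can do better.

```python
# Illustrative only: a hypothetical joint distribution P(X=x, Y=y)
# over three feature values (rows) and two classes (columns).
joint = [
    [0.30, 0.10],  # x = 0
    [0.05, 0.25],  # x = 1
    [0.15, 0.15],  # x = 2 (classes are indistinguishable here)
]


def bayes_error(joint):
    """Bayes error rate: 1 - sum_x max_y P(x, y)."""
    return 1.0 - sum(max(row) for row in joint)


def classifier_error(joint, predict):
    """Error of a deterministic rule, where predict[x] is the class chosen for x."""
    return 1.0 - sum(row[predict[x]] for x, row in enumerate(joint))


err_bayes = bayes_error(joint)                      # about 0.30
err_optimal = classifier_error(joint, [0, 1, 0])    # picks the most likely class per x
err_worse = classifier_error(joint, [1, 1, 0])      # wrong choice at x = 0

print(err_bayes, err_optimal, err_worse)
```

The rule that picks the most probable class for each feature value attains the Bayes error, and any other rule has an error at least as large. A loss that bounds this quantity from above, as the abstract describes, gives training a target tied to the best achievable accuracy rather than to a surrogate such as cross-entropy.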
Taxonomy
Topics: Neural Networks and Applications
