Variational Classification
Shehzaad Dhuliawala, Mrinmaya Sachan, Carl Allen

TL;DR
This paper introduces a variational approach to classification that offers a new probabilistic interpretation of softmax classifiers, improving calibration and robustness without sacrificing accuracy.
Contribution
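The paper develops a variational framework for softmax classifiers. It addresses the inconsistency between the distribution that softmax inputs are expected to follow and the distribution they follow in practice, which improves properties such as calibration and robustness.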
Findings
Maintains classification accuracy
Improves calibration and robustness
Enhances sample efficiency in low-data regimes
Abstract
We present a latent variable model for classification that provides a novel probabilistic interpretation of neural network softmax classifiers. We derive a variational objective to train the model, analogous to the evidence lower bound (ELBO) used to train variational auto-encoders, that generalises the softmax cross-entropy loss. Treating inputs to the softmax layer as samples of a latent variable, our abstracted perspective reveals a potential inconsistency between their anticipated distribution, required for accurate label predictions, and their empirical distribution found in practice. We augment the variational objective to mitigate such inconsistency and induce a chosen latent distribution, instead of the implicit assumption found in a standard softmax layer. Overall, we provide new theoretical insight into the inner workings of widely-used softmax classifiers. Empirical…
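To make the "implicit assumption found in a standard softmax layer" concrete, the sketch below checks one well-known reading of a softmax layer: if the latent inputs z are assumed to follow class-conditional Gaussians with identity covariance, Bayes' rule yields exactly a softmax over linear logits. The Gaussian form, the function names, and the toy dimensions are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def gaussian_posterior(z, means, priors):
    # Bayes' rule with p(z|y) = N(mu_y, I) (assumed form, for illustration).
    # log p(z|y) = -0.5 * ||z - mu_y||^2 + const, with const shared by all classes.
    sq_dist = ((z[:, None, :] - means[None, :, :]) ** 2).sum(axis=-1)
    log_joint = -0.5 * sq_dist + np.log(priors)[None, :]
    return softmax(log_joint)

def softmax_layer(z, means, priors):
    # Equivalent linear layer: w_y = mu_y, b_y = -0.5 * ||mu_y||^2 + log p(y).
    weights = means
    biases = -0.5 * (means ** 2).sum(axis=-1) + np.log(priors)
    return softmax(z @ weights.T + biases)

rng = np.random.default_rng(0)
means = rng.normal(size=(3, 4))       # 3 classes, 4 latent dimensions (toy sizes)
priors = np.array([0.5, 0.3, 0.2])    # hypothetical class priors
z = rng.normal(size=(5, 4))           # 5 latent samples, i.e. softmax-layer inputs

p_bayes = gaussian_posterior(z, means, priors)
p_layer = softmax_layer(z, means, priors)
max_gap = float(np.abs(p_bayes - p_layer).max())
print(max_gap)
```

The two posteriors agree because the term -0.5 * ||z||^2 is the same for every class and cancels inside the softmax. Under this reading, the label predictions are only accurate if the latent inputs really follow the assumed class-conditional distribution, which is the kind of mismatch the abstract describes.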
Taxonomy
Topics: Generative Adversarial Networks and Image Synthesis · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)
Methods: Softmax
