Deep Probabilistic Supervision for Image Classification
Anton Adelöw, Matteo Gamba, Atsuto Maki

TL;DR
Deep Probabilistic Supervision (DPS) introduces a framework that constructs sample-specific target distributions for image classification, improving accuracy, calibration, and robustness by modeling predictive uncertainty explicitly.
Contribution
DPS is a novel learning framework that builds sample-specific target distributions through statistical inference, reducing reliance on hard targets and enhancing model calibration and robustness.
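To make the contrast between hard targets and target distributions concrete, the following minimal sketch computes cross-entropy against a one-hot target and against a soft target. It is not the DPS inference procedure, which this summary does not specify: the helper names (`softmax`, `cross_entropy`, `one_hot`), the logits, and the soft target `[0.7, 0.2, 0.1]` are all assumptions made up for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(target, logits):
    """H(target, softmax(logits)) = -sum_k target[k] * log p[k]."""
    probs = softmax(logits)
    # Classes with zero target mass contribute nothing to the sum.
    return -sum(t * math.log(p) for t, p in zip(target, probs) if t > 0)

def one_hot(label, n_classes):
    """Hard target: all probability mass on the labeled class."""
    return [1.0 if k == label else 0.0 for k in range(n_classes)]

logits = [2.0, 1.0, 0.1]           # hypothetical model outputs for one sample
hard = one_hot(0, 3)               # conventional hard target
soft = [0.7, 0.2, 0.1]             # hypothetical sample-specific target (illustrative values)

hard_loss = cross_entropy(hard, logits)
soft_loss = cross_entropy(soft, logits)

# With a soft target, the loss is minimized when the prediction equals the
# target, where it reduces to the target's entropy (a finite optimum).
matched_logits = [math.log(t) for t in soft]
soft_loss_at_optimum = cross_entropy(soft, matched_logits)
soft_entropy = -sum(t * math.log(t) for t in soft)
```

Under a hard target the loss only approaches zero as the predicted probability of the labeled class approaches 1, which is one way to see why hard targets push a model toward overconfident predictions. A soft target instead has its optimum at a prediction that keeps some mass on other classes.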
Findings
DPS improves test accuracy, e.g., by 2.0% for DenseNet-264 on ImageNet.
DPS significantly lowers Expected Calibration Error (ECE), e.g., by 40% for ResNet-50 on CIFAR-100.
Combining DPS with contrastive loss enhances robustness under label noise.
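For readers unfamiliar with the calibration metric cited in the findings, here is a minimal sketch of the commonly used binned ECE estimator: predictions are grouped into equal-width confidence bins, and the gap between accuracy and mean confidence in each bin is averaged, weighted by bin size. The function name, the default of 10 bins, and the half-open binning convention are assumptions; the paper's exact evaluation protocol is not given in this summary.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Binned ECE: sum over bins of (bin weight) * |accuracy - mean confidence|.

    confidences: top-class predicted probabilities in [0, 1]
    correct:     1/True where the top-class prediction was right, else 0/False
    """
    if len(confidences) != len(correct):
        raise ValueError("confidences and correct must have the same length")
    n = len(confidences)
    if n == 0:
        return 0.0

    # Per-bin sample counts, confidence sums, and correct-prediction counts.
    counts = [0] * n_bins
    conf_sums = [0.0] * n_bins
    hit_sums = [0] * n_bins
    for conf, hit in zip(confidences, correct):
        # Equal-width bins [k/n_bins, (k+1)/n_bins); 1.0 falls in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        counts[idx] += 1
        conf_sums[idx] += conf
        hit_sums[idx] += int(hit)

    ece = 0.0
    for count, conf_sum, hit_sum in zip(counts, conf_sums, hit_sums):
        if count == 0:
            continue  # empty bins carry no weight
        accuracy = hit_sum / count
        mean_conf = conf_sum / count
        ece += (count / n) * abs(accuracy - mean_conf)
    return ece

# Four predictions made at 90% confidence, three of them correct:
# accuracy 0.75 versus confidence 0.90 gives a calibration gap of 0.15.
example_ece = expected_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 1, 1, 0])
```

A perfectly calibrated model has an ECE of 0. An overconfident model, whose confidence exceeds its accuracy, yields a positive value.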
Abstract
Supervised training of deep neural networks for classification typically relies on hard targets, which promote overconfidence and can limit calibration, generalization, and robustness. Self-distillation methods aim to mitigate this by leveraging inter-class and sample-specific information present in the model's own predictions, but often remain dependent on hard targets without explicitly modeling predictive uncertainty. With this in mind, we propose Deep Probabilistic Supervision (DPS), a principled learning framework constructing sample-specific target distributions via statistical inference on the model's own predictions, remaining independent of hard targets after initialization. We show that DPS consistently yields higher test accuracy (e.g., +2.0% for DenseNet-264 on ImageNet) and significantly lower Expected Calibration Error (ECE) (-40% ResNet-50, CIFAR-100) than existing…
Taxonomy
Topics: Machine Learning and Data Classification · Advanced Neural Network Applications · Adversarial Robustness in Machine Learning
