Decoupling Feature Extraction and Classification Layers for Calibrated Neural Networks
Mikkel Jordahn, Pablo M. Olmos

TL;DR
This paper proposes decoupling feature extraction and classification training in over-parametrized DNNs to improve calibration without sacrificing accuracy, and introduces a variational approach with a Gaussian prior for further calibration gains.
Contribution
The paper introduces a decoupled training method for DNNs, together with a variational approach that uses a Gaussian prior, to improve calibration while maintaining accuracy.
Findings
Decoupling training improves calibration in WRN and ViT models.
Placing a Gaussian prior on the last hidden layer outputs and training the classification stage variationally further improves calibration.
Both methods retain accuracy across multiple image classification benchmarks.
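Calibration claims like those above are commonly quantified with the Expected Calibration Error (ECE), which compares a model's confidence with its accuracy inside confidence bins. The sketch below is a generic NumPy version for illustration only. The function name, the equal-width binning, and the default of 10 bins are assumptions of this sketch and are not taken from the paper.

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=10):
    """Equal-width-bin ECE (illustrative, not the paper's evaluation code).

    probs:  (N, C) array of predicted class probabilities.
    labels: (N,) array of integer class labels.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)

    confidences = probs.max(axis=1)        # confidence of the predicted class
    predictions = probs.argmax(axis=1)     # predicted class
    correct = (predictions == labels).astype(float)

    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            # |accuracy - mean confidence| weighted by the bin's share of samples
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap
    return ece

# Toy usage: two over-confident and two under-confident predictions.
toy_probs = [[0.95, 0.05], [0.95, 0.05], [0.65, 0.35], [0.65, 0.35]]
toy_labels = [0, 1, 0, 0]
toy_ece = expected_calibration_error(toy_probs, toy_labels)
```

A perfectly calibrated model would score 0. Larger values mean confidence and accuracy disagree more.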
Abstract
Deep Neural Networks (DNN) have shown great promise in many classification applications, yet are widely known to have poorly calibrated predictions when they are over-parametrized. Improving DNN calibration without compromising on model accuracy is of extreme importance and interest in safety-critical applications such as in the healthcare sector. In this work, we show that decoupling the training of feature extraction layers and classification layers in over-parametrized DNN architectures such as Wide Residual Networks (WRN) and Visual Transformers (ViT) significantly improves model calibration whilst retaining accuracy, and at a low training cost. In addition, we show that placing a Gaussian prior on the last hidden layer outputs of a DNN, and training the model variationally in the classification training stage, even further improves calibration. We illustrate these methods improve…
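The decoupling idea in the abstract can be sketched as two training stages: train the whole network first, then freeze the feature extraction layers and train the classification layer separately. The NumPy toy below is a minimal illustration under stated assumptions, not the authors' implementation. The tiny tanh network, the synthetic data, the hyperparameters, and the choice to re-initialize the classification layer before stage 2 are all assumptions of this sketch. The variational Gaussian-prior stage is not shown.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)   # stabilise the exponentials
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train(X, y, W1, W2, steps, lr, freeze_features):
    """Gradient descent on cross-entropy for a tiny two-layer network.

    W1 plays the role of the feature extractor, W2 the classification layer.
    With freeze_features=True, only W2 is updated.
    """
    Y = np.eye(W2.shape[1])[y]             # one-hot targets
    for _ in range(steps):
        H = np.tanh(X @ W1)                # extracted features
        P = softmax(H @ W2)                # class probabilities
        G = (P - Y) / len(X)               # gradient of the loss w.r.t. logits
        grad_W2 = H.T @ G
        if not freeze_features:
            grad_W1 = X.T @ ((G @ W2.T) * (1.0 - H ** 2))
            W1 = W1 - lr * grad_W1
        W2 = W2 - lr * grad_W2
    return W1, W2

def accuracy(X, y, W1, W2):
    return float(((np.tanh(X @ W1) @ W2).argmax(axis=1) == y).mean())

# Hypothetical toy data: two classes separated by the line x0 + x1 = 0.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# Stage 1: train feature extractor and classifier together.
W1 = rng.normal(scale=0.5, size=(2, 8))
W2 = rng.normal(scale=0.5, size=(8, 2))
W1, W2 = train(X, y, W1, W2, steps=300, lr=0.5, freeze_features=False)

# Stage 2: freeze the features, start a fresh classification layer
# (re-initialisation is an assumption of this sketch), and train only that layer.
W1_frozen = W1.copy()
W2_new = rng.normal(scale=0.5, size=(8, 2))
W1_after, W2_new = train(X, y, W1, W2_new, steps=300, lr=0.5, freeze_features=True)

stage2_accuracy = accuracy(X, y, W1_after, W2_new)
```

Only the classification layer is updated in stage 2, so that stage costs a fraction of full training, which is consistent with the low training cost mentioned in the abstract.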
Taxonomy
Topics: Neural Networks and Applications
