Decoupling Feature Extraction and Classification Layers for Calibrated Neural Networks

Mikkel Jordahn; Pablo M. Olmos

arXiv:2405.01196·cs.LG·May 7, 2024

Open Access

TL;DR

This paper proposes decoupling feature extraction and classification training in over-parametrized DNNs to improve calibration without sacrificing accuracy, and introduces a variational approach with a Gaussian prior for further calibration gains.

Contribution

The paper introduces a decoupled training method for DNNs, together with a variational approach that uses a Gaussian prior, to improve calibration while maintaining accuracy.

Findings

1. Decoupling training improves calibration in WRN and ViT models.
2. Gaussian prior with variational training further enhances calibration.
3. Methods retain accuracy across multiple image classification benchmarks.
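
To make "calibration" concrete, the sketch below computes Expected Calibration Error (ECE), a commonly used calibration measure. This page does not say which metric the paper reports, so ECE, the equal-width binning, the function name `expected_calibration_error`, and the toy predictions are all assumptions chosen for illustration.

```python
def expected_calibration_error(confidences, correct, n_bins=10):
    """Equal-width-bin ECE: weighted mean of |accuracy - confidence| per bin.

    confidences: predicted probability of the chosen class, each in [0, 1].
    correct:     1 if the prediction was right, 0 otherwise.
    """
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 goes into the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))

    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # empty bins contribute nothing
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        ece += len(members) / total * abs(accuracy - avg_conf)
    return ece


# Hypothetical predictions: four over-confident ones (95% confident, 50% right)
# and two roughly calibrated ones (55% confident, 50% right).
confidences = [0.95, 0.95, 0.95, 0.95, 0.55, 0.55]
correct = [1, 1, 0, 0, 1, 0]
ece = expected_calibration_error(confidences, correct)
```

A model whose confidence matches its accuracy in every bin scores 0; the over-confident bin dominates the score in this toy example.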

Abstract

Deep Neural Networks (DNNs) have shown great promise in many classification applications, yet are widely known to have poorly calibrated predictions when they are over-parametrized. Improving DNN calibration without compromising on model accuracy is of extreme importance and interest in safety-critical applications such as in the health-care sector. In this work, we show that decoupling the training of feature extraction layers and classification layers in over-parametrized DNN architectures such as Wide Residual Networks (WRN) and Vision Transformers (ViT) significantly improves model calibration whilst retaining accuracy, and at a low training cost. In addition, we show that placing a Gaussian prior on the last hidden layer outputs of a DNN, and training the model variationally in the classification training stage, even further improves calibration. We illustrate these methods improve…
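
The decoupling idea in the abstract can be sketched as two training stages in PyTorch. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy data, the tiny `extractor` and `head` modules, the Adam settings, and the choice to train a freshly initialized head in the second stage are all hypothetical. The variational stage with a Gaussian prior is not shown.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical toy data: 256 samples, 20 features, 3 classes.
x = torch.randn(256, 20)
y = torch.randint(0, 3, (256,))

# Hypothetical stand-ins for a real backbone (e.g. WRN or ViT) and its classifier.
extractor = nn.Sequential(nn.Linear(20, 32), nn.ReLU())
head = nn.Linear(32, 3)
loss_fn = nn.CrossEntropyLoss()


def train(params, forward, steps=50, lr=1e-2):
    """Run a few full-batch Adam steps on the given parameters only."""
    opt = torch.optim.Adam(params, lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(forward(x), y)
        loss.backward()
        opt.step()
    return loss.item()


# Stage 1: train feature extractor and classification head jointly.
stage1_loss = train(
    list(extractor.parameters()) + list(head.parameters()),
    lambda inp: head(extractor(inp)),
)

# Stage 2: freeze the extractor so only classification layers are updated.
for p in extractor.parameters():
    p.requires_grad_(False)
extractor.eval()
frozen_before = [p.detach().clone() for p in extractor.parameters()]

# Assumption: the second stage trains a freshly initialized head.
new_head = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 3))
with torch.no_grad():
    features = extractor(x)  # fixed features, computed once
stage2_loss = train(new_head.parameters(), lambda _: new_head(features))
```

Because the extractor is frozen, its outputs can be computed once and reused, which is one plausible reason the second stage is cheap relative to full training.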


Taxonomy

Topics: Neural Networks and Applications