Constraining Representations Yields Models That Know What They Don't Know

Joao Monteiro, Pau Rodriguez, Pierre-Andre Noel, Issam Laradji, David Vazquez

arXiv:2208.14488·cs.LG·April 20, 2023

TL;DR

This paper introduces Total Activation Classifiers (TAC), a method that constrains neural network representations with class-aware codes to improve confidence estimation and detect erroneous predictions, enhancing model safety.

Contribution

The work proposes a novel class-aware activation constraint technique, TAC, applicable to various architectures, which improves confidence scoring and error detection without affecting original model accuracy.

Findings

01 TAC improves confidence estimation over baseline models.

02 TAC enhances rejection and deferment capabilities.

03 TAC performs well across multiple architectures and data types.

Abstract

A well-known failure mode of neural networks is that they may confidently return erroneous predictions. Such unsafe behaviour is particularly frequent when the use case slightly differs from the training context, and/or in the presence of an adversary. This work presents a novel direction to address these issues in a broad, general manner: imposing class-aware constraints on a model's internal activation patterns. Specifically, we assign to each class a unique, fixed, randomly-generated binary vector - hereafter called class code - and train the model so that its cross-depths activation patterns predict the appropriate class code according to the input sample's class. The resulting predictors are dubbed Total Activation Classifiers (TAC), and TACs may either be trained from scratch, or used with negligible cost as a thin add-on on top of a frozen, pre-trained neural network. The…
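The class-code idea in the abstract can be illustrated with a small sketch. It covers only what the abstract states (one unique, fixed, randomly generated binary vector per class); everything else is an assumption made for illustration, including the function names, the code length, the `activation_profile` input standing in for a summary of the model's activations across depths, and the nearest-code decision rule with a distance-based confidence score. It is not the authors' implementation.

```python
import numpy as np

def make_class_codes(num_classes, code_len, seed=0):
    """Draw one fixed random binary code per class, redrawing until all are unique."""
    rng = np.random.default_rng(seed)
    while True:
        codes = rng.integers(0, 2, size=(num_classes, code_len))
        if len(np.unique(codes, axis=0)) == num_classes:
            return codes

def predict_with_confidence(activation_profile, codes):
    """Assumed decision rule: choose the class whose code is nearest in L1 distance.

    activation_profile is a hypothetical vector of values in [0, 1] with one
    entry per code bit. The confidence is 1 minus the normalized distance to
    the chosen code, so it lies in [0, 1].
    """
    distances = np.abs(codes - activation_profile).sum(axis=1)
    predicted = int(np.argmin(distances))
    confidence = 1.0 - float(distances[predicted]) / codes.shape[1]
    return predicted, confidence

# Illustrative values only: 10 classes, 64-bit codes.
codes = make_class_codes(num_classes=10, code_len=64)

# Simulate a profile that roughly matches the code of class 3.
noise = np.random.default_rng(1).normal(0.0, 0.1, size=64)
profile = np.clip(codes[3] + noise, 0.0, 1.0)

pred, conf = predict_with_confidence(profile, codes)
print(pred, round(conf, 3))
```

Under these assumptions, a low confidence value means the profile is far from every class code, which is one way such a score could drive rejecting or deferring a prediction.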

Videos

Constraining Representations Yields Models That Know What They Don't Know · SlidesLive

Taxonomy

Topics: Adversarial Robustness in Machine Learning · Anomaly Detection Techniques and Applications · Explainable Artificial Intelligence (XAI)

Methods: Balanced Selection