Calibration of Natural Language Understanding Models with Venn–ABERS Predictors
Patrizio Giovannotti

TL;DR
This paper introduces inductive Venn–ABERS predictors (IVAPs) for transformer-based NLU models, yielding well-calibrated probabilistic outputs across diverse tasks without sacrificing accuracy.
Contribution
It presents a calibration method for transformers based on inductive Venn–ABERS predictors (IVAPs), which are guaranteed to be well calibrated under minimal assumptions while maintaining the underlying model's predictive accuracy.
Findings
IVAP produces well-calibrated probabilities across tasks
Predictions are uniformly spread over the [0,1] interval
Calibration does not reduce original model accuracy
Abstract
Transformers, currently the state of the art in natural language understanding (NLU) tasks, are prone to generating uncalibrated predictions or extreme probabilities, which makes it relatively difficult to base decisions on their output. In this paper we propose to build several inductive Venn–ABERS predictors (IVAPs), which are guaranteed to be well calibrated under minimal assumptions, based on a selection of pre-trained transformers. We test their performance over a set of diverse NLU tasks and show that they are capable of producing well-calibrated probabilistic predictions that are uniformly spread over the [0,1] interval, all while retaining the original model's predictive accuracy.
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Speech Recognition and Synthesis
