Making Heads and Tails of Models with Marginal Calibration for Sparse Tagsets

Michael Kranzlein; Nelson F. Liu; Nathan Schneider

arXiv:2109.07494 · cs.CL · May 18, 2023

TL;DR

This paper investigates the calibration of probabilistic tagging models with sparse tagsets. It recommends ways to measure and reduce calibration error, both overall and within separate tag frequency groups, so that the models' confidence scores are more reliable.

Contribution

The paper introduces tag frequency grouping (TFG) as a way to measure calibration error in different frequency bands, and shows that several post-hoc recalibration techniques reduce calibration error for two existing sequence taggers with sparse tagsets.

Findings

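01. Post-hoc recalibration reduces calibration error across the marginal distribution.

02. TFG measures calibration error in different tag frequency bands.

03. Recalibrating each frequency group separately gives a more equitable reduction of calibration error across the tag frequency spectrum.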
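
The page does not name the post-hoc recalibration techniques that were evaluated. As an illustration only, the sketch below uses temperature scaling, one widely used post-hoc technique: a single scalar divides the logits and is chosen to minimize negative log-likelihood on held-out data. The function names, the grid of candidate temperatures, and the use of a grid search instead of gradient-based fitting are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def softmax(logits):
    # Subtract the row maximum for numerical stability.
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def nll(logits, gold, temperature):
    """Mean negative log-likelihood of the gold tags after scaling logits."""
    probs = softmax(np.asarray(logits, dtype=float) / temperature)
    gold = np.asarray(gold)
    return -np.mean(np.log(probs[np.arange(len(gold)), gold] + 1e-12))

def fit_temperature(logits, gold, candidates=None):
    """Pick the temperature (from a hypothetical grid) with the lowest NLL."""
    if candidates is None:
        candidates = np.linspace(0.25, 5.0, 96)  # assumed grid, step 0.05
    losses = [nll(logits, gold, t) for t in candidates]
    return float(candidates[int(np.argmin(losses))])
```

A fitted temperature above 1 softens the predicted distributions, which is the expected outcome for an overconfident tagger. To recalibrate each frequency group separately, one could fit one temperature per group on the examples belonging to that group, though the paper's actual procedure may differ.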

Abstract

For interpreting the behavior of a probabilistic model, it is useful to measure a model's calibration--the extent to which it produces reliable confidence scores. We address the open problem of calibration for tagging models with sparse tagsets, and recommend strategies to measure and reduce calibration error (CE) in such models. We show that several post-hoc recalibration techniques all reduce calibration error across the marginal distribution for two existing sequence taggers. Moreover, we propose tag frequency grouping (TFG) as a way to measure calibration error in different frequency bands. Further, recalibrating each group separately promotes a more equitable reduction of calibration error across the tag frequency spectrum.
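
To make the measurement idea concrete, here is a minimal sketch of calibration error computed per tag frequency group. It rests on several assumptions that the abstract does not spell out: calibration error is taken to be a binned expected calibration error over every (token, tag) probability, and groups are formed by ranking tags by training frequency and splitting the ranking into equal-sized bands. The function names, `n_bins`, and `n_groups` are hypothetical, and the paper's exact definitions may differ.

```python
import numpy as np

def binned_calibration_error(probs, hits, n_bins=10):
    """Weighted mean |accuracy - confidence| over equal-width probability bins."""
    probs = np.asarray(probs, dtype=float)
    hits = np.asarray(hits, dtype=float)
    # Bin index per probability; a probability of exactly 1.0 goes in the last bin.
    bin_ids = np.minimum((probs * n_bins).astype(int), n_bins - 1)
    error = 0.0
    for b in range(n_bins):
        mask = bin_ids == b
        if mask.any():
            # mask.mean() is the fraction of all predictions that fall in this bin.
            error += mask.mean() * abs(hits[mask].mean() - probs[mask].mean())
    return error

def calibration_error_by_frequency_group(probs, gold, tag_counts, n_groups=3):
    """probs: (n_tokens, n_tags) predicted distributions.
    gold: (n_tokens,) gold tag ids.
    tag_counts: (n_tags,) training frequencies.
    Returns one calibration error per band, most frequent band first."""
    probs = np.asarray(probs, dtype=float)
    gold = np.asarray(gold)
    onehot = np.eye(probs.shape[1])[gold]  # 1 where the tag is the gold tag
    ranked = np.argsort(-np.asarray(tag_counts), kind="stable")
    groups = np.array_split(ranked, n_groups)  # assumed equal-sized bands
    return [
        binned_calibration_error(probs[:, g].ravel(), onehot[:, g].ravel())
        for g in groups
    ]
```

Reporting one number per band keeps the many rare tags from being hidden by the few frequent ones, which would dominate a single pooled score.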

Code & Models

Repositories

nert-nlp/calibration_tfg (PyTorch, official)

Taxonomy

Topics: Music and Audio Processing · Time Series Analysis and Forecasting · Anomaly Detection Techniques and Applications