Posterior calibration and exploratory analysis for natural language processing models

Khanh Nguyen; Brendan O'Connor

arXiv:1508.05154·cs.CL·September 3, 2015

TL;DR

This paper emphasizes the importance of evaluating the calibration of probabilistic NLP models and introduces methods for assessing and improving their uncertainty estimates, enhancing trustworthiness in NLP applications.

Contribution

It presents a novel approach to analyze model calibration in NLP and introduces a coreference sampling algorithm for confidence interval estimation in event extraction.

Findings

01. Many NLP models are miscalibrated, affecting trust in their predictions.

02. The proposed calibration analysis method effectively compares model uncertainties.

03. The coreference sampling algorithm provides reliable confidence intervals for event extraction.
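
To illustrate how a set of sampled analyses can become a confidence interval, here is a minimal sketch. It assumes each posterior sample (for example, one sampled coreference structure) has already been reduced to a single event count. The helper name `percentile_interval` and the counts are hypothetical, and this shows only a generic percentile step, not the paper's sampling algorithm.

```python
import math

def percentile_interval(samples, level=0.95):
    """Return a central percentile interval (low, high) over sampled values."""
    if not samples:
        raise ValueError("need at least one sample")
    xs = sorted(samples)
    n = len(xs)
    # Index positions of the lower and upper tails, widened outward
    # so the interval covers at least the requested share of samples.
    lo_idx = math.floor((1 - level) / 2 * (n - 1))
    hi_idx = math.ceil((1 + level) / 2 * (n - 1))
    return xs[lo_idx], xs[hi_idx]

# Hypothetical event counts, one per sampled analysis of the same corpus.
counts = [12, 15, 11, 14, 13, 16, 12, 13, 14, 15]
low, high = percentile_interval(counts, level=0.9)
```

A wide interval signals that the extracted count depends heavily on uncertain upstream decisions, which is when a user should be cautious about the analysis.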

Abstract

Many models in natural language processing define probabilistic distributions over linguistic structures. We argue that (1) the quality of a model's posterior distribution can and should be directly evaluated, as to whether probabilities correspond to empirical frequencies, and (2) NLP uncertainty can be projected not only to pipeline components, but also to exploratory data analysis, telling a user when to trust and not trust the NLP analysis. We present a method to analyze calibration, and apply it to compare the miscalibration of several commonly used models. We also contribute a coreference sampling algorithm that can create confidence intervals for a political event extraction task.
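
The idea that "probabilities correspond to empirical frequencies" can be checked by grouping predictions into bins and comparing each bin's mean predicted probability with the fraction of predictions that were correct. The following is a minimal sketch, assuming binary labels and equal-width bins. The function name `calibration_error` and the binning scheme are assumptions made for illustration and may differ from the method used in the paper.

```python
import math

def calibration_error(probs, labels, n_bins=10):
    """Size-weighted RMS gap between predicted probability and empirical
    frequency, computed over equal-width probability bins."""
    if len(probs) != len(labels) or not probs:
        raise ValueError("probs and labels must be non-empty and equal length")
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        # A probability of exactly 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, y))
    total = len(probs)
    weighted_sq = 0.0
    for b in bins:
        if not b:
            continue  # empty bins contribute nothing
        mean_p = sum(p for p, _ in b) / len(b)   # average confidence
        freq = sum(y for _, y in b) / len(b)     # empirical frequency
        weighted_sq += (len(b) / total) * (mean_p - freq) ** 2
    return math.sqrt(weighted_sq)
```

A value near zero means predictions made with probability p come true about p of the time. A model that predicts 0.9 for events that occur only half the time would score about 0.4 under this sketch.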
