On Uncertainty In Natural Language Processing

Dennis Ulmer

arXiv:2410.03446 · cs.AI · October 23, 2024

TL;DR

This thesis explores how uncertainty in natural language processing can be characterized, quantified, and reduced through theoretical analysis, empirical experiments, and new methods, with the aim of making NLP model predictions more reliable.

Contribution

The thesis introduces new approaches for uncertainty quantification, including calibrated sampling for language generation and confidence estimation for black-box models, supported by extensive multilingual experiments.

Findings

01 Uncertainty can be characterized from linguistic, statistical, and neural perspectives.

02 A calibrated sampling method is proposed that improves coverage in language generation.

03 A confidence prediction approach is developed for black-box language models.

Abstract

The last decade in deep learning has brought about increasingly capable systems that are deployed in a wide variety of applications. In natural language processing, the field has been transformed by a number of breakthroughs, including large language models, which are used in a growing number of user-facing applications. In order to reap the benefits of this technology and reduce potential harms, it is important to quantify the reliability of model predictions and the uncertainties that shroud their development. This thesis studies how uncertainty in natural language processing can be characterized from a linguistic, statistical and neural perspective, and how it can be reduced and quantified through the design of the experimental pipeline. We further explore uncertainty quantification in modeling by theoretically and empirically investigating the effect of inductive model biases in text…

Code & Models

Repositories

kaleidophon/nlp-uncertainty-zoo (PyTorch, official)

Taxonomy

Topics: Topic Modeling · AI-based Problem Solving and Planning

Methods: Sparse Evolutionary Training