I Don't Know: Explicit Modeling of Uncertainty with an [IDK] Token

Roi Cohen; Konstantin Dobler; Eden Biran; Gerard de Melo

arXiv:2412.06676 · cs.LG · December 10, 2024

TL;DR

This paper introduces a calibration method for large language models that adds an [IDK] token to the vocabulary so the model can express uncertainty explicitly, reducing hallucinated and factually incorrect outputs with minimal loss of knowledge.

Contribution

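The authors add a special [IDK] ("I don't know") token to the model's vocabulary and introduce an objective function that shifts probability mass to that token for incorrect predictions, so the model can express uncertainty explicitly in its output.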

Findings

01 Models trained with the [IDK] token express uncertainty in places where they would previously make mistakes.

02 Hallucinations are reduced on factual downstream tasks.

03 The loss of factual knowledge is small.

Abstract

Large Language Models are known to capture real-world knowledge, allowing them to excel in many downstream tasks. Despite recent advances, these models are still prone to what are commonly known as hallucinations, causing them to emit unwanted and factually incorrect text. In this work, we propose a novel calibration method that can be used to combat hallucinations. We add a special [IDK] ("I don't know") token to the model's vocabulary and introduce an objective function that shifts probability mass to the [IDK] token for incorrect predictions. This approach allows the model to express uncertainty in its output explicitly. We evaluate our proposed method across multiple model architectures and factual downstream tasks. We find that models trained with our method are able to express uncertainty in places where they would previously make mistakes while suffering only a small loss of…
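The abstract describes the objective only at a high level: probability mass moves to the [IDK] token when the model's prediction is incorrect. The sketch below illustrates one way such a target could be built. It is not the paper's formulation. The function names, the `max_shift` parameter, and the rule for how much mass to shift are all hypothetical choices made for illustration.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def idk_soft_target(probs, gold_id, idk_id, max_shift=0.5):
    """Build a training target that gives [IDK] some mass on wrong predictions.

    Hypothetical rule (an assumption, not taken from the paper): if the
    model's top prediction is the gold token, keep a one-hot target.
    Otherwise move a share of the target mass from the gold token to [IDK];
    the share grows as the gold token's probability falls behind the top one.
    """
    predicted_id = max(range(len(probs)), key=probs.__getitem__)
    target = [0.0] * len(probs)
    if predicted_id == gold_id:
        target[gold_id] = 1.0
    else:
        shift = max_shift * (1.0 - probs[gold_id] / probs[predicted_id])
        target[gold_id] = 1.0 - shift
        target[idk_id] = shift
    return target


def cross_entropy(probs, target):
    """Cross-entropy between the model distribution and a soft target."""
    return -sum(t * math.log(p) for p, t in zip(probs, target) if t > 0.0)


# Toy vocabulary; the last entry plays the role of the added [IDK] token.
vocab = ["Paris", "Lyon", "Rome", "[IDK]"]
GOLD, IDK = 0, 3

correct_probs = softmax([2.0, 0.5, 0.1, -1.0])  # top prediction is the gold token
wrong_probs = softmax([0.5, 2.0, 0.1, -1.0])    # top prediction is a wrong token

correct_target = idk_soft_target(correct_probs, GOLD, IDK)
wrong_target = idk_soft_target(wrong_probs, GOLD, IDK)
wrong_loss = cross_entropy(wrong_probs, wrong_target)
```

With a target of this shape, a correct prediction is trained exactly as in standard cross-entropy, while a wrong prediction is pulled partly toward the gold token and partly toward [IDK]. In a real model the [IDK] entry would first have to be added to the tokenizer and embedding matrix; the official repository listed under Code & Models is the place to check how the authors actually do this.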

Code & Models

Repositories

roi-hpi/IDK-token-tuning (PyTorch, official)

Videos

I Don't Know: Explicit Modeling of Uncertainty with an [IDK] Token · SlidesLive

Taxonomy

Topics: Simulation Techniques and Applications