Ensemble based approach to quantifying uncertainty of LLM based classifications

Srijith Rajamohan; Ahmed Salhin; Josh Frazier; Rohit Kumar; Yu-Cheng Tsai; Todd Cook

arXiv:2502.08631·cs.AI·February 20, 2025

Open Access

TL;DR

This paper proposes an ensemble-based probabilistic method to quantify uncertainty in LLM classifications, linking output variance to model certainty and input lexical variance, and demonstrates how fine-tuning reduces sensitivity to input variations.

Contribution

It introduces a novel ensemble approach that estimates classification certainty in LLMs by analyzing output variance related to model knowledge and input lexical diversity.

Findings

1. Fine-tuning reduces sensitivity to input lexical variations.
2. The method effectively estimates classification certainty.
3. Output variance correlates with model confidence.

Abstract

The output of a Large Language Model (LLM) is a function of the model's internal parameters and the input provided in the context window. The hypothesis presented here is that, under a greedy sampling strategy, the variance in the LLM's output is a function of the conceptual certainty embedded in the model's parametric knowledge as well as the lexical variance in the input. Fine-tuning the model reduces the sensitivity of its output to lexical variations in the input. This is then applied to a classification problem, and a probabilistic method is proposed for estimating the certainties of the predicted classes.
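To make the ensemble idea concrete, here is a minimal sketch, not the authors' implementation. It assumes the ensemble consists of lexical variants (paraphrases) of one input, each classified once with deterministic greedy decoding, and it summarizes disagreement with empirical class frequencies and their Shannon entropy. The names `ensemble_certainty` and `toy_classifier` are hypothetical; `toy_classifier` is a keyword rule standing in for a real LLM call, and the paper's probabilistic model may differ from this simple frequency estimate.

```python
from collections import Counter
from math import log
from typing import Callable, Dict, List, Tuple


def ensemble_certainty(
    classify: Callable[[str], str],
    variants: List[str],
) -> Tuple[Dict[str, float], float]:
    """Return (class frequencies, Shannon entropy in nats) over input variants."""
    if not variants:
        raise ValueError("need at least one input variant")
    # One deterministic (greedy) prediction per lexical variant of the same input.
    votes = Counter(classify(text) for text in variants)
    total = sum(votes.values())
    probs = {label: count / total for label, count in votes.items()}
    # Entropy is 0 when every variant yields the same class and grows as
    # the predictions disagree.
    entropy = sum(-p * log(p) for p in probs.values())
    return probs, entropy


def toy_classifier(text: str) -> str:
    # Hypothetical stand-in for an LLM classifier (assumption, not a real API).
    lowered = text.lower()
    return "positive" if "good" in lowered or "great" in lowered else "negative"


variants = [
    "The movie was good.",
    "A great film.",
    "I liked the movie.",
    "Good movie overall.",
]
probs, entropy = ensemble_certainty(toy_classifier, variants)
print(probs, entropy)
```

Under this sketch, a model that is less sensitive to rephrasing (for example, after fine-tuning) would produce more consistent votes across variants and therefore lower entropy.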

Taxonomy

Topics: Statistical and Computational Modeling