Ensemble based approach to quantifying uncertainty of LLM based classifications
Srijith Rajamohan, Ahmed Salhin, Josh Frazier, Rohit Kumar, Yu-Cheng Tsai, Todd Cook

TL;DR
This paper proposes an ensemble-based probabilistic method to quantify uncertainty in LLM classifications, linking output variance to model certainty and input lexical variance, and demonstrates how fine-tuning reduces sensitivity to input variations.
Contribution
It introduces a novel ensemble approach that estimates classification certainty in LLMs by analyzing output variance related to model knowledge and input lexical diversity.
Findings
Fine-tuning reduces sensitivity to input lexical variations.
The method effectively estimates classification certainty.
Output variance correlates with model confidence.
Abstract
The output of a Large Language Model (LLM) is a function of the model's internal parameters and the input provided in the context window. The hypothesis presented here is that, under a greedy sampling strategy, the variance in the LLM's output is a function of the conceptual certainty embedded in the model's parametric knowledge, as well as the lexical variance in the input. Fine-tuning the model reduces the sensitivity of its output to lexical variations in the input. This is then applied to a classification problem, and a probabilistic method is proposed for estimating the certainties of the predicted classes.
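The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the ensemble is a set of lexically varied prompts for the same item, that each prompt is classified once under greedy decoding, and that the empirical vote distribution serves as the certainty estimate. The names `ensemble_certainty`, `classify`, and `toy_classifier` are hypothetical, and `toy_classifier` is a deterministic stand-in for a real LLM call.

```python
from collections import Counter
from math import log
from typing import Callable, Dict, List, Tuple


def ensemble_certainty(
    prompts: List[str],
    classify: Callable[[str], str],
) -> Tuple[str, Dict[str, float], float]:
    """Classify each prompt variant and summarize the spread of labels.

    Returns the majority label, the empirical class probabilities,
    and the Shannon entropy (in nats) of that distribution.
    """
    if not prompts:
        raise ValueError("at least one prompt variant is required")

    # One label per lexical variant of the same underlying input.
    votes = Counter(classify(p) for p in prompts)
    total = sum(votes.values())

    # Empirical probability of each class across the ensemble.
    probs = {label: count / total for label, count in votes.items()}
    predicted = max(probs, key=probs.get)

    # Entropy is 0 when all variants agree and grows with disagreement.
    entropy = -sum(p * log(p) for p in probs.values())
    return predicted, probs, entropy


def toy_classifier(prompt: str) -> str:
    # Hypothetical stand-in for a greedy-decoded LLM classification call.
    return "negative" if "awful" in prompt.lower() else "positive"


variants = [
    "Classify the sentiment: the service was great.",
    "Is this review positive or negative? The service was great.",
    "Sentiment of: 'the service was great'",
    "Label this text: the service was not awful, it was great.",
]
label, probs, entropy = ensemble_certainty(variants, toy_classifier)
```

Under these assumptions, a model that is less sensitive to lexical variation (for example, after fine-tuning) would concentrate the votes on one class and drive the entropy toward zero.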
Taxonomy
Topics: Statistical and Computational Modeling
