From Stochasticity to Signal: A Bayesian Latent State Model for Reliable Measurement with LLMs

Yichi Zhang; Ignacio Martinez

arXiv:2510.23874 · stat.ME · April 24, 2026

TL;DR

This paper introduces a Bayesian latent state model to quantify and improve the reliability of LLM-based classifications, addressing stochasticity-induced measurement errors in business and scientific contexts.

Contribution

It presents a formal Bayesian framework that jointly estimates error rates, true outcome probabilities, and intervention effects, applicable in semi-supervised and unsupervised settings.

Findings

01. The model accurately recovers the true parameters in simulations.

02. It outperforms existing methods at estimating population-level metrics.

03. It provides reliable insights from LLM outputs in a real-world case study.

Abstract

Large Language Models (LLMs) are increasingly used to automate classification tasks in business, such as analyzing customer satisfaction from text. However, the inherent stochasticity of LLMs can create measurement error when the outcome is considered deterministic. This problem is often neglected with the empirical practice of a single round of output, or addressed with ad-hoc methods like majority voting. Such naive approaches fail to quantify uncertainty and can produce biased estimates of population-level metrics. In this paper, we propose a formal statistical solution by introducing a Bayesian latent state model to address it. Our model treats the true classification as a latent variable and the multiple LLM ratings as noisy measurements of this outcome state. This framework jointly estimates LLM error rates, population-level outcome rates, individual-level probabilities of the…
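
To make the latent-state idea concrete, here is a minimal sketch of how repeated LLM ratings can be read as noisy measurements of a hidden binary outcome. It is not the authors' model: the function names are hypothetical, the ratings are assumed conditionally independent given the true state, and the prevalence, sensitivity, and specificity are treated as known inputs, whereas the paper estimates such quantities jointly.

```python
from math import comb


def posterior_positive(k, n, prevalence, sensitivity, specificity):
    """P(true state = 1 | k of n ratings are positive).

    Assumes (for illustration only) that the n ratings are conditionally
    independent given the true state and that the error rates are known.
    """
    # Likelihood of seeing k positives if the true state is 1 or 0.
    like_pos = comb(n, k) * sensitivity**k * (1 - sensitivity) ** (n - k)
    like_neg = comb(n, k) * (1 - specificity) ** k * specificity ** (n - k)
    # Bayes' rule with the population prevalence as the prior.
    numerator = prevalence * like_pos
    return numerator / (numerator + (1 - prevalence) * like_neg)


def expected_positive_rate(prevalence, sensitivity, specificity):
    """Share of single ratings expected to be positive under the same assumptions."""
    return prevalence * sensitivity + (1 - prevalence) * (1 - specificity)


if __name__ == "__main__":
    # 3 of 5 ratings positive: a majority vote would label the item positive.
    print(posterior_positive(3, 5, prevalence=0.5, sensitivity=0.8, specificity=0.8))
    print(posterior_positive(3, 5, prevalence=0.1, sensitivity=0.8, specificity=0.8))
    # Raw positive rate from single ratings when the true prevalence is 0.1.
    print(expected_positive_rate(0.1, 0.8, 0.8))
```

With these illustrative numbers, the same 3-of-5 vote corresponds to a posterior of 0.8 when the outcome is common (prevalence 0.5) but only about 0.31 when it is rare (prevalence 0.1), and the raw single-rating positive rate would be 0.26 against a true prevalence of 0.1. This is the kind of gap between naive counts and calibrated estimates that the abstract attributes to single-round output and majority voting.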
