Express Your Doubts -- Probabilistic World Modeling Should not be Based on Token logprobs
Eitan Wagner, Omri Abend

TL;DR
This position paper argues that token log probabilities from large language models are an unsound basis for probabilistic world modeling. It advocates second-order prediction instead, in which probabilities are stated explicitly as part of the output.
Contribution
It highlights theoretical and practical problems with treating token logprobs as world probabilities and proposes second-order prediction as a sounder alternative.
Findings
Different training phases and use cases imply distinct, potentially conflicting, desired output distributions for token logprobs.
Using output probabilities as event probabilities can be misleading.
Second-order prediction offers a theoretically sound approach.
Abstract
Language modeling has shifted in recent years from a distribution over strings to prediction models with textual inputs and outputs for general-purpose tasks. This position paper highlights the often overlooked implications of this shift for the use of large language models (LLMs) as probability estimators, especially for world probabilities. In light of the theoretical distinction between distribution estimation and response prediction, we examine LLM training phases and common use cases for LLM output probabilities. We show that the different settings lead to distinct, potentially conflicting, desired output distributions. This lack of clarity leads to pitfalls when using output probabilities as event probabilities. Our position advocates for second-order prediction -- incorporating probabilities explicitly as part of the output -- as a theoretically sound method, in contrast to using…
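The contrast the abstract draws can be made concrete with a minimal sketch. It assumes an invented next-token distribution and a hypothetical "Probability: x" output format; neither comes from the paper. The first function reads an event probability directly off one token's logprob. The second treats the probability as part of the generated text (second-order prediction) and parses it. The split of mass across surface forms shown here is just one illustrative pitfall, not necessarily the one the authors analyze.

```python
import math
import re

# Hypothetical toy next-token log probabilities for a yes/no question.
# The values are invented purely for illustration.
toy_logprobs = {
    "Yes": math.log(0.40),
    "yes": math.log(0.15),
    "No": math.log(0.25),
    "It": math.log(0.20),  # e.g. the start of "It depends..."
}


def first_order_estimate(logprobs, token="Yes"):
    """Read P(event) straight off a single token's probability."""
    return math.exp(logprobs[token])


def parse_stated_probability(text):
    """Second-order reading: extract a probability stated in the output text.

    Assumes the (hypothetical) format "Probability: <number>".
    Returns None if no value is found or the value is outside [0, 1].
    """
    match = re.search(r"probability\s*[:=]\s*([01](?:\.\d+)?)", text, re.IGNORECASE)
    if match is None:
        return None
    value = float(match.group(1))
    return value if 0.0 <= value <= 1.0 else None


if __name__ == "__main__":
    # The token-level number depends on which surface form is queried.
    print(first_order_estimate(toy_logprobs, "Yes"))
    print(first_order_estimate(toy_logprobs, "yes"))
    # The stated number is read from the response itself.
    print(parse_stated_probability("Answer: yes. Probability: 0.7"))
```

In this toy setup the token-level estimate changes with the surface form chosen, while the parsed value is whatever the model explicitly commits to in its response.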
