Eliciting Numerical Predictive Distributions of LLMs Without Autoregression

Julianna Piskorz; Katarzyna Kobalczyk; Mihaela van der Schaar

arXiv:2603.02913·cs.LG·March 4, 2026

Open Access · 1 dataset

TL;DR

This paper explores methods to extract numerical predictive distributions from Large Language Models without autoregressive sampling, by analyzing internal representations to estimate distributional statistics efficiently.

Contribution

The paper introduces regression probes that predict statistical functionals (e.g., mean, median, quantiles) of an LLM's numerical output distribution from its embeddings, suggesting that models encode useful uncertainty information internally.
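To make the idea of a regression probe concrete, here is a minimal sketch on synthetic data. It is not the authors' implementation: the embeddings are random vectors standing in for LLM hidden states, the "sampled outputs" are simulated Gaussian draws, and the names `fit_ridge_probe` and `predict` are hypothetical. It only illustrates the general recipe of computing summary statistics from samples once and then fitting a cheap linear map from embeddings to those statistics.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: in practice `embeddings` would be hidden states read
# from an LLM, and `samples` would be numbers decoded by repeated sampling.
n_prompts, dim, n_samples = 500, 16, 64
embeddings = rng.normal(size=(n_prompts, dim))
true_w = rng.normal(size=dim)
loc = embeddings @ true_w
samples = loc[:, None] + rng.normal(scale=0.5, size=(n_prompts, n_samples))

# Probe targets: statistical functionals of each prompt's sampled outputs.
targets = np.column_stack([
    samples.mean(axis=1),
    np.median(samples, axis=1),
    np.quantile(samples, 0.9, axis=1),
])

def fit_ridge_probe(X, Y, lam=1e-2):
    """Closed-form ridge regression with a bias column (assumed probe form)."""
    Xb = np.column_stack([X, np.ones(len(X))])
    A = Xb.T @ Xb + lam * np.eye(Xb.shape[1])
    return np.linalg.solve(A, Xb.T @ Y)

def predict(W, X):
    """Apply the fitted probe to new embeddings."""
    return np.column_stack([X, np.ones(len(X))]) @ W

# Fit on 400 prompts, evaluate on the remaining 100.
train, test = slice(0, 400), slice(400, None)
W = fit_ridge_probe(embeddings[train], targets[train])
pred = predict(W, embeddings[test])
rmse = np.sqrt(((pred - targets[test]) ** 2).mean(axis=0))
print(dict(zip(["mean", "median", "q90"], rmse.round(3))))
```

Once such a probe is fitted, estimating a statistic for a new prompt costs one forward pass plus a matrix product, rather than many sampled generations. Whether a linear map suffices for real LLM embeddings is an empirical question that this sketch does not answer.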

Findings

01 LLM embeddings contain signals about predictive distribution statistics

02 Probes can estimate means, medians, and quantiles without sampling

03 Potential for more efficient uncertainty estimation in LLMs

Abstract

Large Language Models (LLMs) have recently been successfully applied to regression tasks -- such as time series forecasting and tabular prediction -- by leveraging their in-context learning abilities. However, their autoregressive decoding process may be ill-suited to continuous-valued outputs, where obtaining predictive distributions over numerical targets requires repeated sampling, leading to high computational cost and inference time. In this work, we investigate whether distributional properties of LLM predictions can be recovered without explicit autoregressive generation. To this end, we study a set of regression probes trained to predict statistical functionals (e.g., mean, median, quantiles) of the LLM's numerical output distribution directly from its internal representations. Our results suggest that LLM embeddings carry informative signals about summary statistics of their…
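Quantiles, one of the functionals named in the abstract, can be targeted directly with the pinball (quantile) loss instead of being computed from samples first. The sketch below is an illustration under stated assumptions, not the paper's training procedure: the data are synthetic, the probe is linear, the optimiser is plain subgradient descent, and `pinball_loss` and `fit_quantile_probe` are hypothetical names.

```python
import numpy as np

def pinball_loss(y, pred, tau):
    """Mean pinball loss for quantile level tau in (0, 1)."""
    diff = y - pred
    return np.mean(np.maximum(tau * diff, (tau - 1) * diff))

def fit_quantile_probe(X, y, tau, lr=0.05, steps=3000):
    """Linear quantile probe fitted by subgradient descent on the pinball loss."""
    Xb = np.column_stack([X, np.ones(len(X))])  # append a bias column
    w = np.zeros(Xb.shape[1])
    for _ in range(steps):
        diff = y - Xb @ w
        # Subgradient w.r.t. the prediction: -tau where we under-predict,
        # (1 - tau) where we over-predict.
        grad_pred = np.where(diff > 0, -tau, 1 - tau)
        w -= lr * Xb.T @ grad_pred / len(y)
    return w

rng = np.random.default_rng(1)

# Synthetic stand-ins for embeddings and single sampled numeric outputs.
X = rng.normal(size=(2000, 8))
y = X @ rng.normal(size=8) + rng.normal(size=2000)

tau = 0.9
w = fit_quantile_probe(X, y, tau)
pred = np.column_stack([X, np.ones(len(X))]) @ w

# A calibrated 0.9-quantile estimate should sit above roughly 90% of targets.
coverage = np.mean(y <= pred)
print(f"pinball loss: {pinball_loss(y, pred, tau):.3f}, coverage: {coverage:.3f}")
```

The design choice here is that the loss needs only one sampled value per prompt, since the pinball loss is minimised in expectation by the conditional quantile. That is one possible way to avoid building a full empirical distribution for every training prompt.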

Code & Models

Datasets

jpiskorz/GuessLLM
dataset · 40 downloads

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI) · Topic Modeling