Agentic Forecasting using Sequential Bayesian Updating of Linguistic Beliefs

Kevin Murphy

arXiv:2604.18576·cs.AI·May 5, 2026

TL;DR

The paper introduces the Bayesian Linguistic Forecaster (BLF), a novel agentic system that uses hierarchical Bayesian methods and natural language evidence to achieve state-of-the-art binary forecasting performance on the ForecastBench benchmark.

Contribution

It presents a new system combining linguistic belief states, hierarchical aggregation, and calibration, outperforming existing methods on a standard benchmark.

Findings

01 BLF outperforms all top public methods on ForecastBench.

02 All three proposed components significantly contribute to performance gains.

03 The developed back-testing framework has a leakage rate below 1.5%.

Abstract

We present the Bayesian Linguistic Forecaster (BLF), an agentic system for binary forecasting that achieves state-of-the-art performance on the ForecastBench benchmark. The system is built on three ideas. (1) Linguistic belief state: a semi-structured representation combining numerical probability estimates with natural-language evidence summaries, updated by the LLM at each step of an iterative tool-use loop. This contrasts with the common approach of appending all retrieved evidence to an ever-growing, unstructured context. (2) Hierarchical multi-trial aggregation: running $K$ independent trials and combining them using logit-space averaging shrinkage with a data-dependent prior. (3) Hierarchical calibration: Platt scaling with a hierarchical prior, which avoids over-shrinking extreme predictions for sources with skewed base rates. On 400 questions from the ForecastBench leaderboard,…
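Ideas (2) and (3) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the exact formulas, so the fixed `shrinkage` weight, the `prior_prob` argument (standing in for the data-dependent prior), and the Platt parameters `a` and `b` are all assumptions chosen for illustration. In a hierarchical version, `a` and `b` would presumably be fitted per question source and pulled toward shared global values, which is not shown here.

```python
import math

def logit(p, eps=1e-6):
    """Map a probability to log-odds, clipping to avoid infinities."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

def sigmoid(x):
    """Map log-odds back to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def aggregate_trials(probs, prior_prob=0.5, shrinkage=0.2):
    """Average K trial forecasts in logit space, then shrink toward a prior.

    Assumption: a fixed convex weight `shrinkage` on the prior's logit.
    The paper describes a data-dependent prior; here it is just an input.
    """
    mean_logit = sum(logit(p) for p in probs) / len(probs)
    shrunk = (1.0 - shrinkage) * mean_logit + shrinkage * logit(prior_prob)
    return sigmoid(shrunk)

def platt_scale(p, a=1.0, b=0.0):
    """Platt scaling applied to a probability: sigmoid(a * logit(p) + b).

    Assumption: a and b are already fitted; a=1, b=0 is the identity map.
    """
    return sigmoid(a * logit(p) + b)

# Hypothetical forecasts from K = 3 independent trials of one question.
trials = [0.90, 0.85, 0.95]
combined = aggregate_trials(trials, prior_prob=0.5, shrinkage=0.2)
calibrated = platt_scale(combined, a=1.0, b=0.0)
```

Averaging in logit space rather than probability space keeps confident trials from being diluted linearly, and the shrinkage term then pulls the combined forecast part of the way back toward the prior.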
