The Wisdom of Deliberating AI Crowds: Does Deliberation Improve LLM-Based Forecasting?
Paul Schneider, Amalie Schramm

TL;DR
This study tests whether structured deliberation among large language models improves their forecasting accuracy. It finds a significant improvement for diverse model groups with shared information, but none for homogeneous groups, and none from giving the models additional contextual information.
Contribution
The paper shows that letting LLMs review each other's forecasts before updating can improve accuracy, but only under some conditions (here, diverse models with shared information), which highlights both the potential and the limits of deliberation strategies for AI forecasting.
Findings
Deliberation improves accuracy in diverse models with shared information.
No benefit observed in homogeneous model groups.
Additional contextual information does not enhance forecast accuracy.
Abstract
Structured deliberation has been found to improve the performance of human forecasters. This study investigates whether a similar intervention, i.e. allowing LLMs to review each other's forecasts before updating, can improve accuracy in large language models (GPT-5, Claude Sonnet 4.5, Gemini Pro 2.5). Using 202 resolved binary questions from the Metaculus Q2 2025 AI Forecasting Tournament, accuracy was assessed across four scenarios: (1) diverse models with distributed information, (2) diverse models with shared information, (3) homogeneous models with distributed information, and (4) homogeneous models with shared information. Results show that the intervention significantly improves accuracy in scenario (2), reducing Log Loss by 0.020 or about 4 percent in relative terms (p = 0.017). However, when homogeneous groups (three instances of the same model) engaged in the same process, no…
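The accuracy metric named in the abstract, Log Loss, can be illustrated with a minimal sketch. The helper names (`log_loss`, `relative_change`) and the toy forecasts below are hypothetical and are not taken from the paper; they only show how a before/after comparison of binary forecasts might be computed. As a sanity check on the reported figures, an absolute reduction of 0.020 being about 4 percent implies a baseline Log Loss of roughly 0.5.

```python
import math


def log_loss(probs, outcomes, eps=1e-15):
    """Mean binary log loss over a set of questions; lower is better.

    probs    -- forecast probabilities that each question resolves "yes"
    outcomes -- resolved outcomes, 1 for "yes" and 0 for "no"
    """
    total = 0.0
    for p, y in zip(probs, outcomes):
        # Clip so that a forecast of exactly 0 or 1 does not produce log(0).
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(probs)


def relative_change(before, after):
    """Change in a score relative to its starting value (negative = reduction)."""
    return (after - before) / before


# Hypothetical toy data, NOT from the study: four resolved questions with
# forecasts before and after a deliberation round.
outcomes = [1, 0, 1, 0]
initial = [0.6, 0.4, 0.7, 0.3]
updated = [0.7, 0.3, 0.8, 0.2]

ll_before = log_loss(initial, outcomes)
ll_after = log_loss(updated, outcomes)
print(f"before={ll_before:.3f} after={ll_after:.3f} "
      f"relative change={relative_change(ll_before, ll_after):+.1%}")
```

In this toy case the updated forecasts move toward the resolved outcomes, so the Log Loss falls. The study's comparison is of this kind, though its exact aggregation and significance testing are not described in the excerpt above.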
Taxonomy
Topics: Forecasting Techniques and Applications · Artificial Intelligence in Healthcare and Education · Explainable Artificial Intelligence (XAI)
