Future Is Unevenly Distributed: Forecasting Ability of LLMs Depends on What We're Asking
Chinmay Karkar, Paras Chopra

TL;DR
This paper examines how the forecasting ability of large language models varies significantly based on question framing, context, and external knowledge, highlighting the importance of prompt design.
Contribution
It systematically analyzes the factors that influence LLM forecasting performance across domains and shows how question framing affects accuracy and reliability.
Findings
Forecasting ability varies with question type and context.
Adding factual news context influences belief formation.
Model performance depends heavily on prompt design.
Abstract
Large Language Models (LLMs) demonstrate partial forecasting competence across social, political, and economic events. Yet their predictive ability varies sharply with domain structure and prompt framing. We investigate how forecasting performance varies across model families on real-world questions about events that occurred after the model's training cutoff date. We analyze how context, question type, and external knowledge affect accuracy and calibration, and how adding factual news context modifies belief formation and failure modes. Our results show that forecasting ability is highly variable: it depends on what we ask and how we ask it.
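The abstract refers to accuracy and calibration without naming the metrics used. As a rough illustration only, the sketch below scores binary forecasts with three standard measures: thresholded accuracy, the Brier score, and a binned expected calibration error. The function names, the 0.5 threshold, and the bin count are assumptions made for this example and are not taken from the paper.

```python
from typing import Sequence


def accuracy(probs: Sequence[float], outcomes: Sequence[int], threshold: float = 0.5) -> float:
    """Fraction of questions where the thresholded forecast matches the outcome."""
    hits = sum((p >= threshold) == bool(o) for p, o in zip(probs, outcomes))
    return hits / len(probs)


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared error between forecast probability and the 0/1 outcome (lower is better)."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def expected_calibration_error(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 5
) -> float:
    """Weighted average gap between mean forecast and observed frequency per bin."""
    bins = [[] for _ in range(n_bins)]
    for p, o in zip(probs, outcomes):
        # Clamp so that p == 1.0 falls into the last bin.
        idx = min(int(p * n_bins), n_bins - 1)
        bins[idx].append((p, o))

    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue  # Empty bins contribute nothing.
        mean_p = sum(p for p, _ in members) / len(members)
        freq = sum(o for _, o in members) / len(members)
        ece += (len(members) / total) * abs(mean_p - freq)
    return ece


if __name__ == "__main__":
    # Hypothetical forecasts and resolved outcomes, for illustration only.
    forecasts = [0.9, 0.7, 0.2, 0.6, 0.1]
    resolved = [1, 1, 0, 0, 0]
    print("accuracy:", accuracy(forecasts, resolved))
    print("brier:", brier_score(forecasts, resolved))
    print("ece:", expected_calibration_error(forecasts, resolved))
```

Accuracy and calibration can diverge: a model that answers 0.5 on every question is perfectly calibrated when half the events occur, yet carries no discriminative signal. Reporting both separates the two effects.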
Taxonomy
Topics: Computational and Text Analysis Methods · Forecasting Techniques and Applications · Topic Modeling
