Future-as-Label: Scalable Supervision from Real-World Outcomes
Benjamin Turtel, Paul Wilczewski, Danny Franklin, Kris Skothiem

TL;DR
Future-as-Label is a scalable supervision method that uses realized real-world outcomes as training labels, so language models can improve at probabilistic forecasting without manual annotation.
Contribution
It extends reinforcement learning with verifiable rewards to real-world prediction, enabling outcome-driven training using only realized outcomes as supervision.
Findings
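Qwen3-32B trained with Foresight Learning improves Brier score by 27% relative to its pretrained baseline.
The same model halves calibration error relative to that baseline.
It outperforms Qwen3-235B on constructed future-event prediction tasks and the Metaculus benchmark despite a 7x parameter disadvantage.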
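For readers unfamiliar with the two metrics named in these findings, the sketch below shows one common way to compute them for binary events. It is illustrative only: the function names, the 10-bin default, and the equal-width binning are assumptions, and the paper may define calibration error differently.

```python
from typing import Sequence


def brier_score(probs: Sequence[float], outcomes: Sequence[int]) -> float:
    """Mean squared difference between forecast probabilities and 0/1 outcomes.

    Lower is better; a constant forecast of 0.5 scores 0.25.
    """
    if len(probs) != len(outcomes) or len(probs) == 0:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


def expected_calibration_error(
    probs: Sequence[float], outcomes: Sequence[int], n_bins: int = 10
) -> float:
    """Binned calibration error (assumed definition, equal-width bins).

    Within each bin, compare the mean forecast with the observed event
    frequency, then average the gaps weighted by bin size.
    """
    if len(probs) != len(outcomes) or len(probs) == 0:
        raise ValueError("probs and outcomes must be non-empty and equal length")
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        bins[idx].append((p, y))
    total = len(probs)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        mean_forecast = sum(p for p, _ in members) / len(members)
        event_rate = sum(y for _, y in members) / len(members)
        ece += (len(members) / total) * abs(mean_forecast - event_rate)
    return ece
```

A forecaster who says 0.9 on two events of which only one occurs has a calibration gap of about 0.4 under this definition, even though each individual forecast looks confident.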
Abstract
Time creates free supervision: forecasts about real-world events resolve to verifiable outcomes. The passage of time provides labels that require no annotation. To exploit this structure, we extend reinforcement learning with verifiable rewards to real-world prediction over time. We train language models to make probabilistic forecasts from causally masked information, using proper scoring rules as the reward function once events resolve. Learning is driven entirely by realized outcomes, enabling scalable outcome-based supervision in open-world prediction. On real-world forecasting benchmarks, Qwen3-32B trained using Foresight Learning improves Brier score by 27% and halves calibration error relative to its pretrained baseline, and outperforms Qwen3-235B on both constructed future-event prediction tasks and the Metaculus benchmark despite a 7x parameter disadvantage.
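The training signal described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Document` type, `causally_mask`, and the choice of `1 - (p - y)^2` as the reward are hypothetical, since the abstract says only that a proper scoring rule is used once events resolve.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class Document:
    published: date
    text: str


def causally_mask(docs: List[Document], forecast_date: date) -> List[Document]:
    """Keep only documents published strictly before the forecast date,
    so the model cannot see information from after the question was posed."""
    return [d for d in docs if d.published < forecast_date]


def brier_reward(prob: float, outcome: int) -> float:
    """Assumed reward: 1 minus the squared error against the resolved outcome.

    This is a proper scoring rule, so expected reward is maximized by
    reporting the true probability of the event.
    """
    if not 0.0 <= prob <= 1.0:
        raise ValueError("prob must lie in [0, 1]")
    if outcome not in (0, 1):
        raise ValueError("outcome must be 0 or 1")
    return 1.0 - (prob - outcome) ** 2


def expected_reward(reported: float, true_prob: float) -> float:
    """Expected reward for reporting `reported` when the event occurs
    with probability `true_prob`."""
    return (true_prob * brier_reward(reported, 1)
            + (1.0 - true_prob) * brier_reward(reported, 0))


def best_report(true_prob: float, steps: int = 100) -> float:
    """Grid search showing that the honest report maximizes expected reward."""
    grid = [i / steps for i in range(steps + 1)]
    return max(grid, key=lambda q: expected_reward(q, true_prob))
```

Because the reward depends only on the forecast and the realized outcome, no human labeling step appears anywhere in the loop; `best_report(0.7)` lands on 0.7, which is the sense in which a proper scoring rule rewards calibrated probabilities rather than overconfident ones.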
Taxonomy
Topics: Artificial Intelligence in Healthcare and Education · Machine Learning in Healthcare · Topic Modeling
