AI Predicts AGI: Leveraging AGI Forecasting and Peer Review to Explore LLMs' Complex Reasoning Capabilities
Fabrizio Davide, Pietro Torre, Leonardo Ercolani, Andrea Gaggioli

TL;DR
This study evaluates the ability of large language models to forecast the emergence of AGI by 2030, using automated peer review and benchmarking, revealing diverse predictions and highlighting the need for specialized evaluation methods.
Contribution
Introduces an automated peer review process for LLM forecasts and develops an AGI-specific benchmark to assess LLMs' complex reasoning in speculative scenarios.
Findings
LLMs' estimates of the likelihood of AGI by 2030 range from 3% to 47.6%, with a median of 12.5%.
Peer review scores show strong reliability (ICC = 0.79).
External benchmarks show consistent LLM rankings across evaluation methods.
Abstract
We tasked 16 state-of-the-art large language models (LLMs) with estimating the likelihood of Artificial General Intelligence (AGI) emerging by 2030. To assess the quality of these forecasts, we implemented an automated peer review process (LLM-PR). The LLMs' estimates varied widely, ranging from 3% (Reka-Core) to 47.6% (GPT-4o), with a median of 12.5%. These estimates closely align with a recent expert survey that projected a 10% likelihood of AGI by 2027, underscoring the relevance of LLMs in forecasting complex, speculative scenarios. The LLM-PR process demonstrated strong reliability, evidenced by a high Intraclass Correlation Coefficient (ICC = 0.79), reflecting notable consistency in scoring across the models. Among the models, Pplx-70b-online emerged as the top performer, while Gemini-1.5-pro-api ranked the lowest. A cross-comparison with external benchmarks, such as LMSYS…
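For readers unfamiliar with the reliability statistic quoted above, the following is a minimal sketch of how an Intraclass Correlation Coefficient can be computed from a table of review scores. The abstract does not say which ICC form the authors used, so this assumes the two-way random-effects, absolute-agreement, single-rater form, often written ICC(2,1). The function name `icc2_1`, the table layout (one row per reviewed forecast, one column per reviewing model), and the toy ratings are all hypothetical and are not taken from the paper.

```python
def icc2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    scores: list of rows, one per rated item; each row holds one score per rater.
    Assumed layout (not from the paper): rows = forecasts, columns = reviewers.
    """
    n = len(scores)       # number of rated items (rows)
    k = len(scores[0])    # number of raters (columns)
    if n < 2 or k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a complete table with at least 2 items and 2 raters")

    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Two-way ANOVA sums of squares: items, raters, and residual error.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    # Share of variance attributable to the rated items themselves.
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )


# Illustrative toy ratings only: 6 items scored by 4 raters.
toy_scores = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
toy_icc = icc2_1(toy_scores)
```

Values near 1 indicate that raters assign nearly the same scores to each item, while values near 0 indicate that most of the variance comes from rater disagreement. Other ICC forms (for example, consistency-based or averaged over raters) can give noticeably different numbers on the same table, so the form should be confirmed against the full paper before comparing against the reported 0.79.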
Taxonomy
Topics: Big Data and Business Intelligence
Methods: ALIGN
