TL;DR
This paper introduces Fermi Problems as a new reasoning challenge for AI, providing datasets of real and synthetic questions with solutions, revealing current models' struggles with approximate estimation tasks.
Contribution
It presents the first datasets and benchmarks for Fermi Problems, aiming to advance AI's reasoning abilities through a novel challenge.
Findings
Even fine-tuned large language models perform poorly on Fermi Problems, with estimates that are off by two orders of magnitude on average.
The datasets include detailed solutions with executable programs and supporting facts.
Fermi Problems pose a significant challenge for current AI systems, highlighting areas for future research.
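To make "off by two orders of magnitude" concrete, here is a minimal sketch of one way to measure such an error: the absolute base-10 logarithm of the ratio between an estimate and a reference answer. The function name `orders_of_magnitude_off` and the example numbers are illustrative assumptions, not the paper's scoring function or data.

```python
import math


def orders_of_magnitude_off(estimate: float, reference: float) -> float:
    """Return how many powers of ten separate an estimate from a reference.

    Hypothetical helper for illustration; not taken from the paper.
    Both values must be positive because the measure is a log-ratio.
    """
    if estimate <= 0 or reference <= 0:
        raise ValueError("estimate and reference must be positive")
    return abs(math.log10(estimate / reference))


# Made-up numbers: an estimate 100x too small is about two orders of magnitude off.
print(orders_of_magnitude_off(5e4, 5e6))

# The measure is symmetric: overshooting by 100x gives the same distance.
print(orders_of_magnitude_off(5e8, 5e6))
```

Under this measure a value of 0 means an exact match and 1 means the estimate is off by a factor of ten in either direction, so an average near 2 corresponds to estimates that are typically about 100 times too large or too small.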
Abstract
Many real-world problems require the combined application of multiple reasoning abilities employing suitable abstractions, commonsense knowledge, and creative synthesis of problem-solving strategies. To help advance AI systems towards such capabilities, we propose a new reasoning challenge, namely Fermi Problems (FPs), which are questions whose answers can only be approximately estimated because their precise computation is either impractical or impossible. For example, "How much would the sea level rise if all ice in the world melted?" FPs are commonly used in quizzes and interviews to bring out and evaluate the creative reasoning abilities of humans. To do the same for AI systems, we present two datasets: 1) A collection of 1k real-world FPs sourced from quizzes and olympiads; and 2) a bank of 10k synthetic FPs of intermediate complexity to serve as a sandbox for the harder real-world…
