Can LLMs Perceive Time? An Empirical Investigation
Aniketh Garikaparthi

TL;DR
This paper shows empirically that large language models overestimate their own task durations by 4-7 times, score at or below chance on relative timing judgments, and lack experiential grounding in their own inference time. The failures persist in multi-step agentic settings.
Contribution
It provides the first comprehensive empirical analysis of LLMs' perception of time, revealing systematic overestimation and failure to accurately judge durations and relative timings.
Findings
Models overestimate task durations by 4-7 times.
Models perform at or below chance on timing-based pair comparisons.
Post-hoc estimates diverge from actual durations by an order of magnitude in either direction.
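The two headline metrics in the findings above can be illustrated with a minimal sketch. It assumes the overestimation factor is the ratio of predicted to measured duration and that pair-comparison performance is plain accuracy against the measured ordering; the paper's exact definitions may differ. The function names and all numbers below are hypothetical and are not taken from the paper.

```python
from statistics import median


def overestimation_ratio(estimated_s, actual_s):
    """Ratio of predicted to measured duration; values above 1 mean overestimation."""
    if actual_s <= 0:
        raise ValueError("actual duration must be positive")
    return estimated_s / actual_s


def pairwise_accuracy(pairs):
    """Fraction of pairs where the predicted faster task matches the measured one.

    Each pair is (predicted_faster, actually_faster), with labels such as 'A' or 'B'.
    """
    if not pairs:
        raise ValueError("need at least one pair")
    correct = sum(1 for predicted, truth in pairs if predicted == truth)
    return correct / len(pairs)


# Hypothetical (estimate, actual) durations in seconds, for illustration only.
runs = [(120.0, 20.0), (300.0, 60.0), (90.0, 15.0)]
ratios = [overestimation_ratio(est, act) for est, act in runs]
median_ratio = median(ratios)

# Hypothetical pair judgments: (model's pick, measured faster task).
pairs = [("A", "B"), ("A", "A"), ("B", "A"), ("B", "A")]
accuracy = pairwise_accuracy(pairs)

print(f"median overestimation ratio: {median_ratio:.1f}x")
print(f"pairwise accuracy: {accuracy:.0%} (chance is 50%)")
```

With two options per pair, chance accuracy is 50%, so a score well below that suggests the model is applying a heuristic that is systematically wrong rather than guessing.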
Abstract
Large language models cannot estimate how long their own tasks take. We investigate this limitation through four experiments across 68 tasks and four model families. Pre-task estimates overshoot actual duration by 4-7×, with models predicting human-scale minutes for tasks completing in seconds. Relative ordering fares no better: on task pairs designed to expose heuristic reliance, models score at or below chance (GPT-5: 18% on counter-intuitive pairs), systematically failing when complexity labels mislead. Post-hoc recall is disconnected from reality: estimates diverge from actuals by an order of magnitude in either direction. These failures persist in multi-step agentic settings, with errors of 5-10×. The models possess propositional knowledge about duration from training but lack experiential grounding in their own inference time, with…
