Is there a half-life for the success rates of AI agents?
Toby Ord

TL;DR
This paper shows that, on the research-engineering task suite of Kwa et al. (2025), AI agent success rates decline roughly exponentially with task length, so each agent can be characterised by a half-life. It explains this with a simple model: a constant rate of failing during each minute a human would take to do the task.
Contribution
It introduces a simple exponential decay model of AI agent performance on longer tasks, in which a constant failure rate per minute of human task time links success rate to task length, and it proposes the half-life as a way to characterise each agent.
Findings
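Within the task suite studied, success rates decline exponentially with task length.
Each agent can be characterised by its own half-life: the task length at which its success rate falls to 50%.
The model is a good fit for the empirical data on research-engineering tasks from Kwa et al. (2025); whether it holds on other task suites is unknown.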
Abstract
Building on the recent empirical work of Kwa et al. (2025), I show that within their suite of research-engineering tasks the performance of AI agents on longer-duration tasks can be explained by an extremely simple mathematical model -- a constant rate of failing during each minute a human would take to do the task. This implies an exponentially declining success rate with the length of the task and that each agent could be characterised by its own half-life. This empirical regularity allows us to estimate the success rate for an agent at different task lengths. And the fact that this model is a good fit for the data is suggestive of the underlying causes of failure on longer tasks -- that they involve increasingly large sets of subtasks where failing any one fails the task. Whether this model applies more generally on other suites of tasks is unknown and an important subject for…
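The constant-failure-rate model described in the abstract can be sketched in a few lines of Python. Under that model, the success rate at a task length of t human-minutes is 0.5 ** (t / half_life), and the formula can be inverted to find the task length at which a given success rate is predicted. The function names and the 60-minute half-life below are assumptions chosen for illustration; they are not values or code from the paper.

```python
import math


def success_rate(task_minutes: float, half_life_minutes: float) -> float:
    """Predicted success probability under a constant per-minute failure rate.

    Equivalent to exp(-rate * task_minutes) with rate = ln(2) / half_life_minutes.
    """
    return 0.5 ** (task_minutes / half_life_minutes)


def task_length_for_success(target: float, half_life_minutes: float) -> float:
    """Task length (in human-minutes) at which the predicted success rate equals `target`."""
    return half_life_minutes * math.log2(1.0 / target)


# Hypothetical half-life, for illustration only (not a figure from the paper).
HALF_LIFE = 60.0

for minutes in (15, 60, 120, 240):
    print(f"{minutes:>4} min -> predicted success {success_rate(minutes, HALF_LIFE):.3f}")

print(f"80% horizon: {task_length_for_success(0.8, HALF_LIFE):.1f} min")
```

One consequence of the exponential form is that the task length for any target success rate is a fixed multiple of the half-life. For example, the 80% horizon is log2(1.25), or about 0.32, times the half-life, whatever the half-life is.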
Taxonomy
Topics: Ethics and Social Impacts of AI · Reinforcement Learning in Robotics · Mobile Crowdsensing and Crowdsourcing
