Validity Is What You Need

Sebastian Benthall; Andrew Clark

arXiv:2510.27628·cs.AI·November 3, 2025

Open Access

TL;DR

This paper proposes a realist definition of Agentic AI as software applications that run autonomously in enterprise settings. It argues that their success depends on validation by end users and principal stakeholders, and suggests that, once validated well, simpler models can sometimes replace complex foundation models.

Contribution

The paper introduces a realist definition of Agentic AI as autonomous enterprise applications and argues that validating the application, rather than evaluating the underlying foundation model, is what determines its success.

Findings

01. Agentic AI is best understood as autonomous enterprise applications.
02. Validation by stakeholders is crucial for Agentic AI success.
03. Simpler models can often replace complex foundation models when validated effectively.

Abstract

While AI agents have long been discussed and studied in computer science, today's Agentic AI systems are something new. We consider other definitions of Agentic AI and propose a new realist definition. Agentic AI is a software delivery mechanism, comparable to software as a service (SaaS), which puts an application to work autonomously in a complex enterprise setting. Recent advances in large language models (LLMs) as foundation models have driven excitement in Agentic AI. We note, however, that Agentic AI systems are primarily applications, not foundations, and so their success depends on validation by end users and principal stakeholders. The tools and techniques needed by the principal users to validate their applications are quite different from the tools and techniques used to evaluate foundation models. Ironically, with good validation measures in place, in many cases the…

Taxonomy

Topics: Multi-Agent Systems and Negotiation · Artificial Intelligence in Law · Explainable Artificial Intelligence (XAI)