Engineering Robustness into Personal Agents with the AI Workflow Store
Roxana Geambasu, Mariana Raykova, Pierre Tholoniat, Trishita Tiwari, Lillian Tsai, Wen Zhang

TL;DR
This paper advocates for integrating disciplined software engineering processes into AI agent workflows to enhance reliability, security, and robustness, moving beyond the current rapid, on-the-fly synthesis paradigm.
Contribution
The paper proposes an AI Workflow Store of reusable, rigorously tested workflows to improve agent robustness and security in high-stakes applications.
Findings
The paper highlights the limitations of on-the-fly AI synthesis for critical applications.
It proposes a framework for reusable, hardened AI workflows.
It discusses the research challenges of balancing flexibility and robustness.
Abstract
The dominant paradigm for AI agents is an "on-the-fly" loop in which agents synthesize plans and execute actions within seconds or minutes in response to user prompts. We argue that this paradigm short-circuits disciplined software engineering (SE) processes -- iterative design, rigorous testing, adversarial evaluation, staged deployment, and more -- that have delivered the (relatively) reliable and secure systems we use today. By focusing on rapid, real-time synthesis, are AI agents effectively delivering users improvised prototypes rather than systems fit for high-stakes scenarios in which users may unwittingly apply them? This paper argues for the need to integrate rigorous SE processes into the agentic loop to produce production-grade, hardened, and deterministically-constrained agent *workflows* that substantially outperform the potentially brittle and vulnerable results of…
