Engineering Robustness into Personal Agents with the AI Workflow Store

Roxana Geambasu; Mariana Raykova; Pierre Tholoniat; Trishita Tiwari; Lillian Tsai; Wen Zhang

arXiv:2605.10907 · cs.CR · May 13, 2026

TL;DR

This paper advocates integrating disciplined software engineering processes into AI agent workflows to improve reliability, security, and robustness, rather than relying on the prevailing paradigm of rapid, on-the-fly synthesis.

Contribution

The paper proposes an AI Workflow Store of reusable, rigorously tested workflows intended to improve agent robustness and security in high-stakes applications.

Findings

01. Highlights the limitations of on-the-fly AI synthesis for critical applications.

02. Proposes a framework for reusable, hardened AI workflows.

03. Discusses research challenges in balancing flexibility and robustness.

Abstract

The dominant paradigm for AI agents is an "on-the-fly" loop in which agents synthesize plans and execute actions within seconds or minutes in response to user prompts. We argue that this paradigm short-circuits disciplined software engineering (SE) processes -- iterative design, rigorous testing, adversarial evaluation, staged deployment, and more -- that have delivered the (relatively) reliable and secure systems we use today. By focusing on rapid, real-time synthesis, are AI agents effectively delivering users improvised prototypes rather than systems fit for high-stakes scenarios in which users may unwittingly apply them? This paper argues for the need to integrate rigorous SE processes into the agentic loop to produce production-grade, hardened, and deterministically-constrained agent *workflows* that substantially outperform the potentially brittle and vulnerable results of…
