PRISM: Prompt Reliability via Iterative Simulation and Monitoring for Enterprise Conversational AI

Keshava Chaitanya; Jahnavi Gundakaram

arXiv:2605.15665·cs.AI·May 18, 2026

TL;DR

PRISM is a framework that continuously tests, diagnoses, and repairs prompts for enterprise conversational AI, keeping them reliable as LLM behavior drifts over time.

Contribution

PRISM introduces a closed-loop, iterative approach to prompt engineering that automates testing, diagnosis, and repair for maintaining prompt reliability in production environments.

Findings

01. Reduces prompt authoring time from 2 days to under 30 minutes

02. Achieves 99% reliability across enterprise agents

03. Detects and repairs regressions within 24 hours

Abstract

Deploying large language model (LLM)-driven conversational agents in enterprise settings requires prompts that are simultaneously correct at launch and resilient to the non-deterministic behavioral drift that characterizes production LLM deployments. Existing prompt optimization frameworks address prompt quality as a one-time compile-time problem, leaving open the equally critical question of how to detect and repair prompt regressions caused by silent LLM behavior changes over time. We present PRISM (Prompt Reliability via Iterative Simulation and Monitoring), a closed-loop framework that treats prompt engineering as a continuous reliability engineering problem rather than a one-time authorship task. PRISM takes as input plain-language agent requirements, a set of configured tools and memory variables, and an initial draft prompt. It automatically generates test cases from…
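
The closed loop described so far (take requirements, tools, memory variables, and a draft prompt; run generated test cases; repair on failure) can be illustrated with a minimal Python sketch. This is not the authors' implementation: every name here (`AgentSpec`, `PromptCase`, `run_tests`, `reliability_loop`, `toy_model`, `toy_repair`) is hypothetical, the test cases are written by hand rather than generated, and the model and repair steps are deterministic stand-ins for real LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class PromptCase:
    """One hand-written check (hypothetical; PRISM generates its test cases)."""
    user_message: str
    must_contain: str  # substring the agent reply is expected to include


@dataclass
class AgentSpec:
    """The inputs the abstract lists; field names are assumptions."""
    requirements: List[str]
    tools: List[str]
    memory_variables: List[str]
    draft_prompt: str


Model = Callable[[str, str], str]                 # (prompt, user_message) -> reply
Repair = Callable[[str, List[PromptCase]], str]   # (prompt, failures) -> new prompt


def run_tests(prompt: str, cases: List[PromptCase], model: Model) -> List[PromptCase]:
    """Return the cases whose reply lacks the expected substring."""
    return [c for c in cases if c.must_contain not in model(prompt, c.user_message)]


def reliability_loop(spec: AgentSpec, cases: List[PromptCase], model: Model,
                     repair: Repair, max_rounds: int = 5) -> Tuple[str, List[PromptCase]]:
    """Test, then repair and retest until all cases pass or the round budget is spent."""
    prompt = spec.draft_prompt
    failures = run_tests(prompt, cases, model)
    rounds = 0
    while failures and rounds < max_rounds:
        prompt = repair(prompt, failures)
        failures = run_tests(prompt, cases, model)
        rounds += 1
    return prompt, failures


def toy_model(prompt: str, user_message: str) -> str:
    """Stand-in for an LLM: replies with every phrase listed on a 'Mention: ' line."""
    prefix = "Mention: "
    phrases = [ln[len(prefix):] for ln in prompt.splitlines() if ln.startswith(prefix)]
    return " ".join(phrases) or "I am not sure."


def toy_repair(prompt: str, failures: List[PromptCase]) -> str:
    """Stand-in for diagnosis and repair: appends one instruction per failing case."""
    return "\n".join([prompt, *(f"Mention: {c.must_contain}" for c in failures)])


spec = AgentSpec(
    requirements=["Answer order and refund questions"],
    tools=["lookup_order"],
    memory_variables=["customer_id"],
    draft_prompt="You are a support agent.",
)
cases = [
    PromptCase("Where is my order?", "order status"),
    PromptCase("I want a refund", "refund policy"),
]
final_prompt, remaining = reliability_loop(spec, cases, toy_model, toy_repair)
```

In this toy run the draft prompt fails both cases, one repair round appends two instructions, and the retest passes. Under the same assumptions, monitoring for drift would amount to calling `run_tests` on a schedule against the deployed prompt and re-entering the loop when it returns failures.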
