Risk-based test framework for LLM features in regulated software

Zhiyin Zhou

arXiv:2601.17292·cs.SE·February 2, 2026

TL;DR

This paper introduces a risk-based testing framework for large language model features in regulated software, addressing safety, privacy, and security concerns through a structured taxonomy and layered testing strategy.

Contribution

It presents a novel six-category risk taxonomy and a layered test strategy specifically designed for LLM features in regulated, safety-critical software environments.

Findings

01 Developed a comprehensive risk taxonomy for LLM features.

02 Designed a layered testing approach covering guardrail, orchestration, and system layers.

03 Applied the framework successfully in a clinical research platform case study.

Abstract

Large language models are increasingly embedded in regulated and safety-critical software, including clinical research platforms and healthcare information systems. While these features enable natural language search, summarization, and configuration assistance, they introduce risks such as hallucinations, harmful or out-of-scope advice, privacy and security issues, bias, instability under change, and adversarial misuse. Prior work on machine learning testing and AI assurance offers useful concepts but limited guidance for interactive, product-embedded assistants. This paper proposes a risk-based testing framework for LLM features in regulated software: a six-category risk taxonomy, a layered test strategy mapping risks to concrete tests across guardrail, orchestration, and system layers, and a case study applying the approach to a Knowledgebase assistant in a clinical research platform.
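
The mapping from risks to tests across layers can be pictured as a small coverage table. The sketch below is illustrative only and is not the paper's implementation: it assumes the six risks named in the abstract correspond to the six taxonomy categories, and the names `RiskTest`, `EXAMPLE_TESTS`, `uncovered_risks`, and `coverage_by_layer`, along with every test name, are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Risk(Enum):
    # Assumption: these mirror the risks listed in the abstract.
    HALLUCINATION = "hallucination"
    HARMFUL_OR_OUT_OF_SCOPE_ADVICE = "harmful or out-of-scope advice"
    PRIVACY_AND_SECURITY = "privacy and security"
    BIAS = "bias"
    INSTABILITY_UNDER_CHANGE = "instability under change"
    ADVERSARIAL_MISUSE = "adversarial misuse"


class Layer(Enum):
    # The three layers named in the abstract.
    GUARDRAIL = "guardrail"
    ORCHESTRATION = "orchestration"
    SYSTEM = "system"


@dataclass(frozen=True)
class RiskTest:
    """One concrete test, tagged with the risk it targets and the layer it runs at."""
    name: str
    risk: Risk
    layer: Layer


def uncovered_risks(tests):
    """Return risks (in taxonomy order) that no test targets."""
    covered = {t.risk for t in tests}
    return [r for r in Risk if r not in covered]


def coverage_by_layer(tests):
    """Return a dict mapping each layer to the set of risks tested there."""
    table = {layer: set() for layer in Layer}
    for t in tests:
        table[t.layer].add(t.risk)
    return table


# Hypothetical test inventory for an assistant feature.
EXAMPLE_TESTS = [
    RiskTest("answer_cites_retrieved_source", Risk.HALLUCINATION, Layer.GUARDRAIL),
    RiskTest("output_filter_blocks_patient_identifiers", Risk.PRIVACY_AND_SECURITY, Layer.GUARDRAIL),
    RiskTest("prompt_injection_in_retrieved_doc", Risk.ADVERSARIAL_MISUSE, Layer.ORCHESTRATION),
    RiskTest("golden_answers_stable_after_model_upgrade", Risk.INSTABILITY_UNDER_CHANGE, Layer.SYSTEM),
]

if __name__ == "__main__":
    for risk in uncovered_risks(EXAMPLE_TESTS):
        print("no test targets:", risk.value)
```

Keeping the risk and layer as explicit tags on each test makes gaps visible: a risk with no tests, or a layer that carries all the coverage for a risk, shows up directly in the table rather than in a reviewer's memory.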

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Adversarial Robustness in Machine Learning · Ethics and Social Impacts of AI