A Clinical Trial Design Approach to Auditing Language Models in Healthcare Setting

Lovedeep Gondara; Jonathan Simkin

arXiv:2411.16702·cs.CY·December 20, 2024

Open Access

TL;DR

This paper introduces a clinical-trial-inspired audit mechanism for healthcare language models. It uses principled sample size and power calculations to keep the number of audited records to a minimum while remaining statistically sound, and demonstrates the approach with a real-world public health example.

Contribution

The paper proposes an audit framework for healthcare language models based on clinical trial design principles, with an emphasis on sample efficiency and statistical validity.

Findings

01 Effective sample size calculation for audits

02 Maintains audit integrity with minimal data

03 Validated in a large-scale public health setting

Abstract

We present an audit mechanism for language models, with a focus on models deployed in the healthcare setting. Our proposed mechanism takes inspiration from clinical trial design: we frame the language model audit as a single-blind equivalence trial in which the comparator is the subject matter experts. We show that our proposed method allows principled sample size and power calculations, so that only the minimum number of records needs to be sampled while maintaining audit integrity and statistical soundness. Finally, we provide a real-world example of the audit used in a production environment in a large-scale public health network.
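
The sample size and power calculation mentioned in the abstract can be illustrated with a small sketch. The abstract does not give the paper's exact formula, so the code below assumes the textbook normal-approximation sample size for an equivalence test (two one-sided tests) comparing two independent proportions, such as model accuracy versus expert accuracy. The function name `equivalence_sample_size` and its parameters (`p_model`, `p_expert`, `margin`) are hypothetical and chosen only for illustration.

```python
import math
from statistics import NormalDist


def equivalence_sample_size(p_model, p_expert, margin, alpha=0.05, power=0.80):
    """Records needed per arm for an equivalence test of two proportions.

    Assumption: textbook normal approximation for two one-sided tests,
    with independent arms of equal size. This is an illustrative sketch,
    not necessarily the calculation used in the paper.
    """
    if not (0 < p_model < 1 and 0 < p_expert < 1):
        raise ValueError("proportions must lie strictly between 0 and 1")
    if margin <= 0:
        raise ValueError("equivalence margin must be positive")

    true_diff = abs(p_model - p_expert)
    if true_diff >= margin:
        raise ValueError("assumed difference must be smaller than the margin")

    beta = 1.0 - power
    z = NormalDist().inv_cdf
    z_alpha = z(1.0 - alpha)
    # With no assumed difference, the type II error is split across
    # the two one-sided tests; otherwise one side dominates.
    z_beta = z(1.0 - beta / 2.0) if true_diff == 0 else z(1.0 - beta)

    # Sum of the binomial variances of the two arms.
    variance = p_model * (1.0 - p_model) + p_expert * (1.0 - p_expert)
    n = (z_alpha + z_beta) ** 2 * variance / (margin - true_diff) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    # Hypothetical inputs: both model and experts assumed 90% accurate,
    # equivalence margin of 5 percentage points.
    print(equivalence_sample_size(0.90, 0.90, margin=0.05))
```

Under these assumptions, a tighter equivalence margin or a higher target power increases the number of records to audit, which is the trade-off a principled calculation makes explicit before any records are sampled.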

Taxonomy

Topics: Biomedical Text Mining and Ontologies · Electronic Health Records Systems · Clinical practice guidelines implementation
