Unseen Risks of Clinical Speech-to-Text Systems: Transparency, Privacy, and Reliability Challenges in AI-Driven Documentation
Nelly Elsayed

TL;DR
This paper identifies socio-technical risks in clinical speech-to-text systems and proposes a layered governance framework for responsible deployment, with an emphasis on transparency and accountability.
Contribution
It develops a comprehensive socio-technical risk framework and governance model for clinical speech-to-text systems, integrating interdisciplinary evidence and practical implementation guidance.
Findings
Risks include consent inconsistencies, performance disparities, and accountability issues.
Clinical STT systems are affected by audio conditions, clinician oversight, and organizational factors.
A six-layer governance model addresses technical, ethical, organizational, and sociocultural risks.
Abstract
AI-driven speech-to-text (STT) documentation systems are increasingly adopted in clinical settings to reduce documentation burden and improve workflow efficiency. However, adoption has outpaced systematic evaluation of socio-technical risks related to transparency, reliability, patient autonomy, and organizational accountability. This study develops a socio-technical framework for identifying and governing risks associated with clinical STT systems. We synthesize interdisciplinary evidence from automatic speech recognition research, clinical workflow and human factors studies, ethical guidance on consent and autonomy, and regulatory and organizational sources. Using a structured narrative synthesis, we iteratively reviewed and thematically analyzed the literature to identify recurring socio-technical risk mechanisms and inform a layered conceptual framework. Findings show that clinical STT…
