Navigating the Reality Gap: Privacy-Preserving On-Device Continual Adaptation of ASR for Clinical Telephony
Darshil Chauhan, Adityasinh Solanki, Vansh Patel, Kanav Kapoor, Ritvik Jain, Aditya Bansal, Pratik Narang, Dhruv Kumar

TL;DR
This paper addresses the challenge of adapting speech recognition models to noisy clinical telephony while preserving patient privacy. It develops an on-device continual adaptation framework that achieves a 17.1% relative reduction in word error rate (WER).
Contribution
It introduces a privacy-preserving on-device continual adaptation method using Low-Rank Adaptation and Experience Replay to improve clinical ASR performance in real-world noisy settings.
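To make the two ingredients concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the summary does not say how the replay buffer is filled or sampled, so the reservoir-sampling policy, the `replay_fraction` parameter, and all names (`lora_trainable_params`, `ReplayBuffer`, `mixed_batch`) are assumptions made for illustration. The only general fact it relies on is that Low-Rank Adaptation replaces a full update of a `d_out x d_in` weight matrix with two trainable factors of rank `r`.

```python
import random


def lora_trainable_params(d_in, d_out, rank):
    """Trainable parameters of a rank-`rank` LoRA update B @ A,
    with B of shape (d_out, rank) and A of shape (rank, d_in)."""
    return rank * (d_in + d_out)


class ReplayBuffer:
    """Fixed-capacity store of earlier-domain examples.

    Filled by reservoir sampling, which is an assumption here and
    not a policy stated in the summary.
    """

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.items = []
        self.seen = 0
        self.rng = random.Random(seed)

    def add(self, example):
        self.seen += 1
        if len(self.items) < self.capacity:
            self.items.append(example)
        else:
            # Keep each example seen so far with equal probability.
            j = self.rng.randrange(self.seen)
            if j < self.capacity:
                self.items[j] = example

    def sample(self, k):
        k = min(k, len(self.items))
        return self.rng.sample(self.items, k)


def mixed_batch(new_examples, buffer, replay_fraction=0.5):
    """Append replayed earlier-domain examples to a batch of new-domain
    examples, so that updates also rehearse what was learned before."""
    n_replay = int(len(new_examples) * replay_fraction)
    return list(new_examples) + buffer.sample(n_replay)


# Hypothetical usage: the domain labels and sizes are illustrative only.
buffer = ReplayBuffer(capacity=4)
for i in range(100):
    buffer.add(("general", i))
batch = mixed_batch([("clinical", i) for i in range(8)], buffer)
```

With a hypothetical 768 x 768 projection and rank 8, the adapter trains 12,288 parameters instead of 589,824, which is why this kind of update is attractive under on-device resource limits. Mixing replayed examples into each batch is the general mechanism by which Experience Replay counters catastrophic forgetting.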
Findings
Achieved a 17.1% relative WER reduction with multi-domain Experience Replay.
Reduced catastrophic forgetting by 55% compared to naive adaptation.
Demonstrated the importance of acoustic adaptation for healthcare ASR usability.
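For readers unfamiliar with the metric behind these figures, the sketch below computes word error rate as word-level edit distance divided by reference length, and a relative reduction as `(baseline - adapted) / baseline`. The function names and the example sentences and numbers are illustrative assumptions, not data from the paper, and the paper's exact scoring pipeline (text normalization, tokenization) is not described in this summary.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference word count,
    computed with a standard dynamic-programming edit distance over words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline, adapted):
    """Fraction by which `adapted` improves on `baseline`."""
    return (baseline - adapted) / baseline


# Illustrative values only, not results reported in the paper.
wer = word_error_rate("patient reports chest pain", "patient report chest pain")
rel = relative_reduction(0.40, 0.30)
```

A relative reduction is measured against the baseline error rather than in absolute percentage points: in the illustrative case above, a drop from 0.40 to 0.30 is a 25% relative reduction but only 10 points absolute.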
Abstract
Automatic Speech Recognition (ASR) holds immense potential to assist in clinical documentation and patient report generation, particularly in resource-constrained regions. However, deployment is currently hindered by a technical deadlock: a severe "Reality Gap" between laboratory performance and noisy, real-world clinical audio, coupled with strict privacy and resource constraints. Such adaptation is essential for clinical telephony systems, where patient speech is highly variable and transcription errors can directly impact downstream clinical workflows. We quantify this gap, showing that a robust multilingual model (IndicWav2Vec) degrades to a WER of up to 40.94% on rural clinical telephony speech from India, rendering it unusable. We demonstrate consistent improvements on these helpline interactions without transmitting raw patient data off-device via an on-device continual adaptation…
Taxonomy
Topics: Machine Learning in Healthcare · Speech Recognition and Synthesis · Domain Adaptation and Few-Shot Learning
