Building an ASR Error Robust Spoken Virtual Patient System in a Highly Class-Imbalanced Scenario Without Speech Data

Vishal Sunder; Prashant Serai; Eric Fosler-Lussier

arXiv:2204.05183·cs.CL·July 4, 2022·1 cites

TL;DR

This paper presents a novel training approach for a Virtual Patient system that effectively handles ASR errors and class imbalance without requiring spoken data, improving intent classification accuracy.

Contribution

The authors introduce a two-step training method that uses an ASR error predictor and does not depend on spoken data, addressing both ASR errors and class imbalance simultaneously.

Findings

01 Significant improvement over baselines at various WER levels

02 Effective handling of class imbalance in SLU training

03 No spoken data needed for training, only text data with error prediction
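The findings above refer to WER (word error rate) levels. As a reminder of the conventional definition, WER is the word-level edit distance between a reference transcript and an ASR hypothesis (substitutions, deletions, and insertions), divided by the number of reference words. The sketch below illustrates that standard metric only; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration, not details taken from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Standard WER: word-level Levenshtein distance / number of reference words.

    Illustrative helper (hypothetical name); tokenization is plain whitespace
    splitting, which is an assumption and not the paper's preprocessing.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)
```

For example, one substituted word in a five-word reference gives a WER of 0.2. Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.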

Abstract

A Virtual Patient (VP) is a powerful tool for training medical students to take patient histories, where responding to a diverse set of spoken questions is essential to simulate natural conversations with a student. The performance of such a Spoken Language Understanding system (SLU) can be adversely affected by both the presence of Automatic Speech Recognition (ASR) errors in the test data and a high degree of class imbalance in the SLU training data. While these two issues have been addressed separately in prior work, we develop a novel two-step training methodology that tackles both these issues effectively in a single dialog agent. As it is difficult to collect spoken data from users without a functioning SLU system, our method does not rely on spoken data for training; instead, we use an ASR error predictor to "speechify" the text data. Our method shows significant improvements over…
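To make the idea of "speechifying" text concrete, the toy sketch below injects ASR-like word substitutions into clean text so that a text-only training set resembles recognizer output. It is not the authors' ASR error predictor, whose design is not described on this page: the confusion table `CONFUSIONS`, the function name `speechify`, and the `error_rate` parameter are all hypothetical stand-ins chosen for illustration.

```python
import random

# Hypothetical confusion table (NOT from the paper): each clean word maps to
# words an ASR system might plausibly output instead.
CONFUSIONS = {
    "pain": ["pane", "pay"],
    "any": ["many"],
    "your": ["you're", "you"],
}


def speechify(text, error_rate, rng, confusions=CONFUSIONS):
    """Return a copy of `text` with simulated ASR substitution errors.

    Each word that has an entry in `confusions` is replaced by a randomly
    chosen confusable word with probability `error_rate`. Words without an
    entry are left unchanged. This is a toy stand-in for a learned error
    predictor, used only to illustrate text-only data augmentation.
    """
    noisy_words = []
    for word in text.split():
        options = confusions.get(word.lower())
        if options and rng.random() < error_rate:
            noisy_words.append(rng.choice(options))
        else:
            noisy_words.append(word)
    return " ".join(noisy_words)


# Usage: generate a noisy variant of a typed question for SLU training.
rng = random.Random(0)
noisy_question = speechify("do you have any pain", error_rate=0.5, rng=rng)
```

Passing an explicit `random.Random` instance keeps the augmentation reproducible. A learned error predictor would replace the fixed table with context-dependent predictions.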

Taxonomy

Topics: Speech and dialogue systems · Topic Modeling · Natural Language Processing Techniques