Black-box Adaptation of ASR for Accented Speech

Kartik Khandelwal; Preethi Jyothi; Abhijeet Awasthi; Sunita Sarawagi

arXiv:2006.13519·eess.AS·June 25, 2020

TL;DR

This paper presents a method for adapting black-box, cloud-based ASR systems to accented speech by coupling them with an open-source, accent-tuned local model, reducing word error rate (WER) on Indian and Australian accents.

Contribution

It introduces a coupling approach in which the output of the cloud service guides frame-level inference in the local model, so accent adaptation does not require access to the service's model parameters.

Findings

01. Achieved up to 28% relative WER reduction on accented speech.

02. Effective adaptation for Indian and Australian accents.

03. Outperforms existing word-level combination strategies.

Abstract

We introduce the problem of adapting a black-box, cloud-based ASR system to speech from a target accent. While leading online ASR services obtain impressive performance on mainstream accents, they perform poorly on sub-populations: we observed that the word error rate (WER) achieved by Google's ASR API on Indian accents is almost twice the WER on US accents. Existing adaptation methods either require access to model parameters or overlay an error-correcting module on output transcripts. We highlight the need for correlating outputs with the original speech to fix accent errors. Accordingly, we propose a novel coupling of an open-source accent-tuned local model with the black-box service where the output from the service guides frame-level inference in the local model. Our fine-grained merging algorithm is better at fixing accent errors than existing word-level combination strategies.…
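
The abstract and findings are stated in terms of WER and relative WER reduction. As a minimal sketch of how these two quantities are commonly computed (not the authors' evaluation code), the example below uses a word-level edit distance. The function names `word_error_rate` and `relative_wer_reduction` are hypothetical, and the numbers in the usage lines are illustrative only, not results from the paper.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference word
                curr[j - 1] + 1,     # insertion of a hypothesis word
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, adapted_wer: float) -> float:
    """Fraction of the baseline WER removed by the adapted system."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - adapted_wer) / baseline_wer


# Illustrative values only (assumptions, not figures from the paper).
example_wer = word_error_rate("the cat sat on the mat", "the cat sit on mat")
example_reduction = relative_wer_reduction(0.25, 0.18)
```

Under this definition, a "28% relative reduction" means the adapted system's WER is 72% of the baseline's, whatever the absolute WER values are.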

Code & Models

Repositories

Kartik14/FineMerge
Official
