N-Best ASR Transformer: Enhancing SLU Performance using Multiple ASR Hypotheses

Karthik Ganesan; Pakhi Bamdev; Jaivarsan B; Amresh Venugopal; Abhinav; Tushar

arXiv:2106.06519·cs.CL·June 14, 2021

TL;DR

This paper proposes using concatenated N-best ASR hypotheses as input to transformer models for SLU, improving performance especially in low-data scenarios and enabling use with third-party ASR APIs.

Contribution

It introduces a simple yet effective method of leveraging multiple ASR hypotheses with transformer models for enhanced SLU performance, especially under limited data conditions.

Findings

01

Matched the performance of the prior state-of-the-art model on the DSTC2 dataset.

02

Significantly outperformed prior methods in low-data regimes.

03

The method works with third-party ASR APIs that do not expose word-lattice information.

Abstract

Spoken Language Understanding (SLU) systems parse speech into semantic structures like dialog acts and slots. This involves the use of an Automatic Speech Recognizer (ASR) to transcribe speech into multiple text alternatives (hypotheses). Transcription errors, common in ASRs, negatively impact downstream SLU performance. Approaches to mitigate such errors involve using richer information from the ASR, either in the form of N-best hypotheses or word-lattices. We hypothesize that transformer models learn better with a simpler utterance representation using the concatenation of the N-best ASR alternatives, where each alternative is separated by a special delimiter [SEP]. In our work, we test our hypothesis by using concatenated N-best ASR alternatives as the input to transformer encoder models, namely BERT and XLM-RoBERTa, and achieve performance equivalent to the prior state-of-the-art model…
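
The input representation described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function name `concat_nbest`, the default `n=5`, the best-first ordering of the hypotheses, and the example utterances are all assumptions made for illustration. Only the idea of joining the N-best alternatives with a `[SEP]` delimiter comes from the abstract.

```python
from typing import List

SEP_TOKEN = "[SEP]"  # delimiter named in the abstract


def concat_nbest(hypotheses: List[str], n: int = 5, sep: str = SEP_TOKEN) -> str:
    """Join the first n ASR hypotheses into one string separated by `sep`.

    Assumes `hypotheses` is ordered best-first, as many ASR APIs return it.
    The function name and the default n are hypothetical.
    """
    # Keep at most n alternatives and drop empty or whitespace-only ones.
    cleaned = [h.strip() for h in hypotheses[:n] if h.strip()]
    return f" {sep} ".join(cleaned)


# Made-up N-best list for a single user utterance.
nbest = [
    "i want cheap thai food",
    "i want a cheap thai food",
    "i want cheap tie food",
]

encoder_input = concat_nbest(nbest, n=3)
print(encoder_input)
# i want cheap thai food [SEP] i want a cheap thai food [SEP] i want cheap tie food
```

The resulting string would then be tokenized and passed to the encoder as a single sequence. The literal `[SEP]` matches BERT's vocabulary; for an encoder whose tokenizer uses a different separator (XLM-RoBERTa's is `</s>`), the `sep` argument would likely need to be set accordingly.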

Code & Models

Repositories

Vernacular-ai/N-Best-ASR-Transformer
PyTorch · Official

Taxonomy

Methods: Attention Is All You Need · Linear Layer · Multi-Head Attention · Adam · Linear Warmup With Linear Decay · Residual Connection · WordPiece · Attention Dropout · Dense Connections