Intent Classification Using Pre-trained Language Agnostic Embeddings For Low Resource Languages

Hemant Yadav; Akshat Gupta; Sai Krishna Rallabandi; Alan W Black; Rajiv Ratn Shah

arXiv:2110.09264·cs.CL·April 19, 2022

Open Access

TL;DR

This paper explores using pre-trained language-agnostic embeddings from an acoustic model for spoken intent classification in low-resource languages, reporting gains over the prior state of the art of approximately 2.11% for Sinhala and 7.00% for Tamil.

Contribution

It presents a comparative study of three embeddings (Phone, Panphone, and Allo) extracted with Allosaurus, a pre-trained universal phone decoder, for intent classification in English, Sinhala, and Tamil under high-, medium-, and low-resource data sizes.

Findings

01. Improved intent classification accuracy for Sinhala and Tamil.

02. Competitive results achieved on English.

03. Performance scales positively with training data size.

Abstract

Building Spoken Language Understanding (SLU) systems that do not rely on language-specific Automatic Speech Recognition (ASR) is an important yet less explored problem in language processing. In this paper, we present a comparative study aimed at employing a pre-trained acoustic model to perform SLU in low-resource scenarios. Specifically, we use three different embeddings extracted using Allosaurus, a pre-trained universal phone decoder: (1) Phone, (2) Panphone, and (3) Allo embeddings. These embeddings are then used in identifying the spoken intent. We perform experiments across three different languages: English, Sinhala, and Tamil, each with different data sizes to simulate high-, medium-, and low-resource scenarios. Our system improves on the state-of-the-art (SOTA) intent classification accuracy by approximately 2.11% for Sinhala and 7.00% for Tamil and achieves competitive results on…
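To make the two-stage idea in the abstract concrete (a language-agnostic phone representation first, an intent classifier second), here is a minimal sketch. It is not the authors' model: it assumes the phone sequences have already been produced by a universal phone recognizer such as Allosaurus, and it swaps in a simple bag-of-phones vector with a nearest-centroid classifier. The function names (phone_vector, train_centroids, classify), the intent labels, and the toy phone transcriptions are all hypothetical and exist only for illustration.

```python
from collections import Counter
import math


def phone_vector(phones, vocab):
    """Return an L2-normalised bag-of-phones count vector over `vocab`."""
    counts = Counter(phones)
    vec = [float(counts[p]) for p in vocab]
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


def train_centroids(examples, vocab):
    """Average the phone vectors of the training utterances for each intent."""
    sums, sizes = {}, Counter()
    for phones, intent in examples:
        vec = phone_vector(phones, vocab)
        acc = sums.setdefault(intent, [0.0] * len(vocab))
        for i, v in enumerate(vec):
            acc[i] += v
        sizes[intent] += 1
    return {k: [v / sizes[k] for v in acc] for k, acc in sums.items()}


def classify(phones, centroids, vocab):
    """Pick the intent whose centroid has the largest dot product with the utterance."""
    vec = phone_vector(phones, vocab)

    def score(intent):
        return sum(a * b for a, b in zip(vec, centroids[intent]))

    return max(centroids, key=score)


# Toy, made-up phone transcriptions (hypothetical data, not from the paper).
TRAIN = [
    (["l", "a", "j", "t", "o", "n"], "lights_on"),
    (["t", "ə", "n", "l", "a", "j", "t", "o", "n"], "lights_on"),
    (["p", "l", "e", "j", "m", "j", "u", "z", "i", "k"], "play_music"),
    (["m", "j", "u", "z", "i", "k", "p", "l", "e", "j"], "play_music"),
]
VOCAB = sorted({p for phones, _ in TRAIN for p in phones})
CENTROIDS = train_centroids(TRAIN, VOCAB)

print(classify(["l", "a", "j", "t"], CENTROIDS, VOCAB))
print(classify(["m", "j", "u", "z", "i", "k"], CENTROIDS, VOCAB))
```

Because the classifier only sees phone symbols, nothing in it is tied to a particular language, which is the property the abstract relies on; the paper's Panphone and Allo embeddings would replace the count vector used in this sketch.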

Taxonomy

Topics: Speech Recognition and Synthesis · Music and Audio Processing · Speech and Audio Processing