Word-Free Spoken Language Understanding for Mandarin-Chinese

Zhiyuan Guo; Yuexin Li; Guo Chen; Xingyu Chen; Akshat Gupta

arXiv:2107.00186 · cs.CL · July 2, 2021 · 1 citation

TL;DR

This paper introduces a phone-based spoken language understanding (SLU) system for Mandarin Chinese that needs no conventional ASR module. It has two blocks: a universal phone recognizer, followed by a Transformer-based language model that operates directly on the recognized phones.

Contribution

It proposes a phone-based SLU system that removes the need for a language-specific ASR module, reducing the pipeline to two blocks, and evaluates it on Mandarin Chinese.

Findings

01 Effective intent classification on a Mandarin Chinese dataset

02 Reduced reliance on large amounts of language-specific training data

03 Demonstrated feasibility of SLU that works directly on phones

Abstract

Spoken dialogue systems such as Siri and Alexa bring great convenience to people's everyday lives. However, current spoken language understanding (SLU) pipelines largely depend on automatic speech recognition (ASR) modules, which require a large amount of language-specific training data. In this paper, we propose a Transformer-based SLU system that works directly on phones. This acoustic-based SLU system consists of only two blocks and does not require an ASR module. The first block is a universal phone recognition system, and the second block is a Transformer-based language model for phones. We verify the effectiveness of the system on an intent classification dataset in Mandarin Chinese.
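
The data flow of the second block described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the phone inventory, intent labels, model width, and all function names are hypothetical, the weights are random and untrained, and a single self-attention layer stands in for the Transformer. The phone sequence is assumed to have already been produced by the first block, the universal phone recognizer.

```python
import math
import random

# Hypothetical phone inventory and intent labels, for illustration only.
PHONES = ["n", "i", "x", "au", "m", "a"]
INTENTS = ["greeting", "play_music", "other"]
D = 8  # model width (an assumption, not taken from the paper)

rng = random.Random(0)  # fixed seed so the sketch is deterministic


def rand_matrix(rows, cols):
    """Untrained random weights; a real system would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


PARAMS = {
    "embed": rand_matrix(len(PHONES), D),
    "Wq": rand_matrix(D, D),
    "Wk": rand_matrix(D, D),
    "Wv": rand_matrix(D, D),
    "Wout": rand_matrix(D, len(INTENTS)),
}


def matmul(A, B):
    """Multiply an n x k matrix by a k x m matrix (lists of rows)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def softmax(xs):
    m = max(xs)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def positional_encoding(pos, d):
    """Sinusoidal encoding: sine on even indices, cosine on odd indices."""
    return [
        math.sin(pos / 10000 ** (i / d)) if i % 2 == 0
        else math.cos(pos / 10000 ** ((i - 1) / d))
        for i in range(d)
    ]


def self_attention(X, p):
    """Single-head scaled dot-product self-attention over the phone sequence."""
    Q, K, V = matmul(X, p["Wq"]), matmul(X, p["Wk"]), matmul(X, p["Wv"])
    scores = matmul(Q, list(zip(*K)))  # Q times K transposed
    weights = [softmax([s / math.sqrt(D) for s in row]) for row in scores]
    return matmul(weights, V)


def classify_intent(phones, p=PARAMS):
    """Embed phones, apply self-attention, mean-pool, and score each intent."""
    X = [
        [e + pe for e, pe in zip(p["embed"][PHONES.index(ph)], positional_encoding(t, D))]
        for t, ph in enumerate(phones)
    ]
    H = self_attention(X, p)
    pooled = [sum(col) / len(H) for col in zip(*H)]  # average over time steps
    logits = matmul([pooled], p["Wout"])[0]
    return dict(zip(INTENTS, softmax(logits)))


if __name__ == "__main__":
    # Phone sequence as it might come out of block 1 (hypothetical example).
    print(classify_intent(["n", "i", "x", "au"]))
```

Because the weights are untrained, the printed probabilities carry no meaning; the sketch only shows that intent scores are computed from phones alone, with no word-level transcription in between.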

Taxonomy

Topics: Topic Modeling · Speech Recognition and Synthesis · Natural Language Processing Techniques