An Effective Non-Autoregressive Model for Spoken Language Understanding

Lizhi Cheng; Weijia Jia; Wenmian Yang

arXiv:2108.07005·cs.CL·August 17, 2021

TL;DR

This paper introduces a non-autoregressive SLU model, the Layered-Refine Transformer, that speeds up inference by more than 10 times and improves accuracy through a Layered Refine Mechanism and a Slot Label Generation task.

Contribution

The paper proposes a new non-autoregressive SLU model with a layered refinement mechanism and slot label generation, addressing uncoordinated-slot issues and boosting performance.
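
To make the uncoordinated-slot issue concrete, here is a minimal sketch that flags it in a predicted label sequence. It assumes BIO-style slot labels (`B-type`, `I-type`, `O`), which this page does not state, and the function name `find_uncoordinated_slots` and the example labels are hypothetical. It illustrates the problem only and is not the paper's method.

```python
from typing import List, Tuple


def find_uncoordinated_slots(labels: List[str]) -> List[Tuple[int, str, str]]:
    """Return (index, previous_label, label) for every I- tag that does not
    continue a chunk of the same slot type (assumes BIO-style labels)."""
    problems = []
    prev = "O"
    for i, label in enumerate(labels):
        if label.startswith("I-"):
            slot_type = label[2:]
            # An I- tag is coordinated only when the previous tag is a
            # B- or I- tag of the same slot type.
            if prev == "O" or prev[2:] != slot_type:
                problems.append((i, prev, label))
        prev = label
    return problems


# Hypothetical prediction where each position was labeled independently.
predicted = ["O", "B-singer", "I-song", "O"]
print(find_uncoordinated_slots(predicted))
```

A model that predicts every position independently has no built-in way to rule out such sequences, which is why dependency information between labels matters.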

Findings

01 Achieves a 1.5% improvement in overall accuracy.

02 Speeds up inference by more than 10 times.

03 Outperforms the state-of-the-art baseline.

Abstract

Spoken Language Understanding (SLU), a core component of task-oriented dialogue systems, requires low inference latency because users are impatient. Non-autoregressive SLU models clearly increase inference speed but suffer from uncoordinated-slot problems caused by the lack of sequential dependency information among slot chunks. To address this shortcoming, we propose a novel non-autoregressive SLU model named Layered-Refine Transformer, which contains a Slot Label Generation (SLG) task and a Layered Refine Mechanism (LRM). SLG is defined as generating the next slot label from the token sequence and the previously generated slot labels. With SLG, the non-autoregressive model can efficiently obtain dependency information during training and spends no extra time in inference. LRM predicts the preliminary SLU results from the Transformer's middle states and utilizes them to guide the…
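
The SLG definition in the abstract (generate the next slot label given the token sequence and the labels generated so far) can be sketched as the construction of training examples. This is a hedged illustration of the data layout only: the `<BOS>` start symbol, the function name `slg_training_pairs`, and the example utterance are assumptions, and the paper's actual model and loss are not reproduced here.

```python
from typing import List, Tuple

BOS = "<BOS>"  # hypothetical start-of-labels symbol, not taken from the paper


def slg_training_pairs(
    tokens: List[str], labels: List[str]
) -> List[Tuple[List[str], List[str], str]]:
    """Build (tokens, label_history, next_label) examples for a
    next-slot-label generation task using gold labels as history."""
    if len(tokens) != len(labels):
        raise ValueError("tokens and labels must have the same length")
    pairs = []
    for i, target in enumerate(labels):
        # The history holds every gold label before position i.
        history = [BOS] + labels[:i]
        pairs.append((tokens, history, target))
    return pairs


# Hypothetical utterance with BIO-style slot labels.
tokens = ["play", "hey", "jude"]
labels = ["O", "B-song", "I-song"]
for _, history, target in slg_training_pairs(tokens, labels):
    print(history, "->", target)
```

Because these examples are needed only as a training signal, a setup like this is consistent with the abstract's claim that SLG adds no inference-time cost.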

Taxonomy

Methods: Attention Is All You Need · Linear Layer · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Multi-Head Attention · Byte Pair Encoding · Softmax · Layer Normalization · Label Smoothing · Residual Connection