Capture Salient Historical Information: A Fast and Accurate Non-Autoregressive Model for Multi-turn Spoken Language Understanding

Lizhi Cheng; Weijia Jia; Wenmian Yang

arXiv:2206.12209·cs.CL·June 27, 2022

TL;DR

This paper introduces SHA-LRT, a novel non-autoregressive model for multi-turn spoken language understanding that captures salient historical information efficiently, significantly improving accuracy and inference speed over existing methods.

Contribution

The paper proposes SHA-LRT, a new model combining salient history attention, layer refinement, and slot label generation for fast, accurate multi-turn SLU.

Findings

01. Achieves 17.5% improvement in overall SLU performance

02. Accelerates inference nearly 15 times compared to baselines

03. Effective on both multi-turn and single-turn SLU tasks

Abstract

Spoken Language Understanding (SLU), a core component of task-oriented dialogue systems, calls for short inference times because human users are impatient. Existing work increases inference speed by designing non-autoregressive models for single-turn SLU tasks, but these models do not carry over to multi-turn SLU, where the dialogue history must be taken into account. The intuitive approach is to concatenate all historical utterances and apply a non-autoregressive model directly. However, this approach misses the salient historical information and suffers from the uncoordinated-slot problem. To overcome these shortcomings, we propose a novel model for multi-turn SLU named Salient History Attention with Layer-Refined Transformer (SHA-LRT), which consists of an SHA module, a Layer-Refined Mechanism (LRM), and a Slot Label Generation (SLG) task. SHA captures salient historical information for the current…

Taxonomy

Topics: Topic Modeling · Speech and dialogue systems · Intelligent Tutoring Systems and Adaptive Learning

Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · SPEED: Separable Pyramidal Pooling EncodEr-Decoder for Real-Time Monocular Depth Estimation on Low-Resource Settings · Softmax · Residual Connection · Adam · Byte Pair Encoding · Layer Normalization · Absolute Position Encodings