Multi-Task Learning for Domain-General Spoken Disfluency Detection in Dialogue Systems

Igor Shalyminov, Arash Eshghi, and Oliver Lemon

arXiv:1810.03352 · cs.CL · October 9, 2018 · 5 citations

Open Access

TL;DR

This paper introduces a multi-task LSTM model for incremental, domain-general detection of spoken disfluencies in dialogue, improving accuracy and demonstrating strong generalization to synthetic datasets without retraining.

Contribution

The paper presents a multi-task LSTM approach to incremental disfluency detection that outperforms prior neural approaches on SWDA and generalizes to the synthetic bAbI+ dataset without retraining.

Findings

01. Outperforms prior neural approaches by ~10 percentage points on SWDA

02. Achieves good generalization to the synthetic bAbI+ dataset without retraining

03. Supports real-time processing for downstream dialogue systems

Abstract

Spontaneous spoken dialogue is often disfluent, containing pauses, hesitations, self-corrections and false starts. Processing such phenomena is essential in understanding a speaker's intended meaning and controlling the flow of the conversation. Furthermore, this processing needs to be word-by-word incremental to allow further downstream processing to begin as early as possible in order to handle real spontaneous human conversational behaviour. In addition, from a developer's point of view, it is highly desirable to be able to develop systems which can be trained from 'clean' examples while also able to generalise to the very diverse disfluent variations on the same data, thereby enhancing both data-efficiency and robustness. In this paper, we present a multi-task LSTM-based model for incremental detection of disfluency structure, which can be hooked up to any component for…
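
To make the word-by-word requirement concrete, here is a minimal sketch of an incremental tagging interface. It is not the paper's multi-task LSTM: the `IncrementalTagger` class, the `FILLERS` lexicon, and the `fluent`/`edit`/`repeat` labels are all hypothetical stand-ins, and a hand-written rule takes the place of a learned recurrent state. The only property it aims to show is that each word receives its tag as soon as it arrives, using left context only.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List, Tuple

# Hypothetical filler lexicon, chosen for illustration only.
FILLERS = {"uh", "um", "er", "erm"}


@dataclass
class IncrementalTagger:
    """Toy stand-in for a recurrent tagger (assumption, not the paper's model).

    The 'state' is simply the list of non-filler words seen so far.
    """
    history: List[str] = field(default_factory=list)

    def step(self, word: str) -> str:
        """Consume one word and return its tag immediately (left context only)."""
        w = word.lower()
        if w in FILLERS:
            return "edit"  # filled pause; not stored in the history
        # A word that repeats the previous non-filler word is tagged 'repeat'.
        tag = "repeat" if self.history and self.history[-1] == w else "fluent"
        self.history.append(w)
        return tag


def tag_stream(words: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (word, tag) pairs one at a time, as each word arrives."""
    tagger = IncrementalTagger()
    for word in words:
        yield word, tagger.step(word)


if __name__ == "__main__":
    for word, tag in tag_stream("i want uh want a ticket".split()):
        print(word, tag)
```

Because each tag depends only on earlier words, the tags for any prefix of an utterance match the first tags of the full utterance, which is what would let a downstream component start work before the speaker finishes.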

Taxonomy

Topics: Speech and dialogue systems · Topic Modeling · Natural Language Processing Techniques