Improving End-of-turn Detection in Spoken Dialogues by Detecting Speaker Intentions as a Secondary Task

Zakaria Aldeneh; Dimitrios Dimitriadis; Emily Mower Provost

arXiv:1805.06511 · cs.CL · May 18, 2018

TL;DR

This paper introduces a multi-task neural approach that predicts speaker intentions alongside turn-transitions in spoken dialogues, improving turn-taking prediction without extra runtime features.

Contribution

The novel contribution is the joint modeling of speaker intentions and turn-transitions to enhance turn-taking prediction in spoken dialogues.

Findings

01

Adding speaker intention prediction as an auxiliary task improves turn-transition prediction performance.

02

The method does not require additional runtime features.

03

Joint modeling outperforms single-task approaches.

Abstract

This work focuses on the use of acoustic cues for modeling turn-taking in dyadic spoken dialogues. Previous work has shown that speaker intentions (e.g., asking a question or uttering a backchannel) can influence turn-taking behavior and are good predictors of turn-transitions in spoken dialogues. However, speaker intentions are not readily available for use by automated systems at run-time, making it difficult to use this information to anticipate a turn-transition. To this end, we propose a multi-task neural approach for predicting turn-transitions and speaker intentions simultaneously. Our results show that adding the auxiliary task of speaker intention prediction improves the performance of turn-transition prediction in spoken dialogues, without relying on additional input features during run-time.
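
The multi-task idea in the abstract can be illustrated with a minimal sketch: one shared representation of the acoustic features feeds two output heads, and training minimizes the turn-transition loss plus a weighted speaker-intention loss. The abstract does not specify the architecture, so everything here is an assumption for illustration only: the single feedforward shared layer, the three intention classes, the `aux_weight` parameter, and all function and variable names are hypothetical and are not taken from the paper.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def linear(weights, bias, x):
    # Dense layer: one output per weight row.
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]

def cross_entropy(probs, label):
    # Negative log-likelihood of the correct class.
    return -math.log(probs[label])

def forward(params, features):
    # Shared layer used by both tasks (assumed feedforward for brevity).
    hidden = [math.tanh(h)
              for h in linear(params["W_shared"], params["b_shared"], features)]
    # Primary head: turn-transition (hold vs. switch).
    turn_probs = softmax(linear(params["W_turn"], params["b_turn"], hidden))
    # Auxiliary head: speaker intention (class set is hypothetical).
    intent_probs = softmax(linear(params["W_intent"], params["b_intent"], hidden))
    return turn_probs, intent_probs

def multitask_loss(params, features, turn_label, intent_label, aux_weight=0.5):
    # Joint objective: primary loss plus weighted auxiliary loss.
    turn_probs, intent_probs = forward(params, features)
    return (cross_entropy(turn_probs, turn_label)
            + aux_weight * cross_entropy(intent_probs, intent_label))

# Toy parameters: 3 acoustic features -> 2 hidden units -> 2 and 3 classes.
toy_params = {
    "W_shared": [[0.2, -0.1, 0.4], [-0.3, 0.5, 0.1]],
    "b_shared": [0.0, 0.1],
    "W_turn": [[0.6, -0.2], [-0.4, 0.3]],
    "b_turn": [0.0, 0.0],
    "W_intent": [[0.1, 0.2], [-0.5, 0.4], [0.3, -0.3]],
    "b_intent": [0.0, 0.0, 0.0],
}
toy_features = [0.5, -1.0, 0.25]

if __name__ == "__main__":
    print(multitask_loss(toy_params, toy_features, turn_label=1, intent_label=0))
```

In this sketch the intention labels enter only through the loss. At run-time only the `turn_probs` output of `forward` would be used, so no intention annotation or extra input feature is needed, which matches the property the abstract describes.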
