Neural Generation of Dialogue Response Timings

Matthew Roddy; Naomi Harte

arXiv:2005.09128·cs.CL·May 20, 2020

TL;DR

This paper introduces neural models that simulate the distribution of response timings in dialogue. Integrated into a spoken dialogue system, they could make interactions sound more natural by matching human timing patterns.

Contribution

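The paper presents neural models that simulate the distributions of spoken response offsets, taking into account both the response turn and the preceding turn. The models are designed to be integrated into the pipeline of an incremental spoken dialogue system (SDS).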

Findings

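01. Human listeners consider certain response timings to be more natural based on the dialogue context.

02. The models simulate the distributions of response offsets and are evaluated with offline experiments and human listening tests.

03. Introducing these models into SDS pipelines could increase the perceived naturalness of interactions.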

Abstract

The timings of spoken response offsets in human dialogue have been shown to vary based on contextual elements of the dialogue. We propose neural models that simulate the distributions of these response offsets, taking into account the response turn as well as the preceding turn. The models are designed to be integrated into the pipeline of an incremental spoken dialogue system (SDS). We evaluate our models using offline experiments as well as human listening tests. We show that human listeners consider certain response timings to be more natural based on the dialogue context. The introduction of these models into SDS pipelines could increase the perceived naturalness of interactions.
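To make the idea of simulating a distribution of response offsets concrete, the following is a minimal sketch and not the authors' implementation. It assumes a model has already produced one score (logit) per discrete time bin, and it draws a response offset from the resulting distribution. The 50 ms bin width, the -500 ms starting offset (a negative offset meaning the response overlaps the end of the preceding turn), and the function names are all hypothetical choices made for illustration.

```python
import math
import random

# Assumed bin width in milliseconds; illustrative, not taken from the paper.
FRAME_MS = 50


def softmax(logits):
    """Convert per-bin scores into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_offset_ms(logits, min_offset_ms=-500, frame_ms=FRAME_MS, rng=random):
    """Draw one response offset (in ms) from per-bin scores.

    Bin i corresponds to an offset of min_offset_ms + i * frame_ms.
    Both min_offset_ms and frame_ms are hypothetical parameters.
    """
    probs = softmax(logits)
    r = rng.random()
    cumulative = 0.0
    for i, p in enumerate(probs):
        cumulative += p
        if r < cumulative:
            return min_offset_ms + i * frame_ms
    # Guard against floating-point rounding: fall back to the last bin.
    return min_offset_ms + (len(probs) - 1) * frame_ms


if __name__ == "__main__":
    # Hypothetical scores favouring an offset of about 200 ms.
    example_logits = [0.0] * 30
    example_logits[14] = 3.0
    print(sample_offset_ms(example_logits, rng=random.Random(0)))
```

Sampling from the distribution, rather than always choosing the most likely bin, is what lets the simulated timings vary from turn to turn in the way a distribution of human response offsets does.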

Code & Models

Repositories

mattroddy/RTNets (PyTorch, official)
