Quantifying the Effects of Prosody Modulation on User Engagement and Satisfaction in Conversational Systems
Jason Ingyu Choi, Eugene Agichtein

TL;DR
This study empirically demonstrates that prosody modulation in conversational systems enhances user satisfaction and engagement, though its effectiveness varies across domains and depends on response content quality.
Contribution
It provides large-scale empirical evidence quantifying how prosody modulation affects user engagement and satisfaction in open-domain conversational systems.
Findings
Prosody modulation significantly increases user satisfaction.
Effects of prosody vary across different conversation domains.
Prosody does not replace the need for coherent, informative responses.
Abstract
As voice-based assistants such as Alexa, Siri, and Google Assistant become ubiquitous, users increasingly expect to maintain natural and informative conversations with such systems. However, for an open-domain conversational system to be coherent and engaging, it must be able to maintain the user's interest for extended periods, without sounding boring or annoying. In this paper, we investigate one natural approach to this problem, of modulating response prosody, i.e., changing the pitch and cadence of the response to indicate delight, sadness or other common emotions, as well as using pre-recorded interjections. Intuitively, this approach should improve the naturalness of the conversation, but attempts to quantify the effects of prosodic modulation on user satisfaction and engagement remain challenging. To accomplish this, we report results obtained from a large-scale empirical study…
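As a rough illustration of the prosody modulation the abstract describes (changing pitch and cadence, plus interjections), the sketch below builds an SSML string in Python. The summary does not say how the authors implemented modulation, so this is an assumption and not their method. The function name `with_prosody`, its parameters, and the example values are hypothetical. The `<prosody>` element with `pitch` and `rate` attributes comes from the W3C SSML specification. The `<say-as interpret-as="interjection">` tag is assumed to follow Alexa's SSML dialect.

```python
from xml.sax.saxutils import escape


def with_prosody(text, pitch="medium", rate="medium", interjection=None):
    """Wrap a response in SSML prosody tags (hypothetical helper).

    pitch and rate take SSML labels such as "low"/"high" or "slow"/"fast".
    interjection, if given, is spoken before the response; the say-as
    value used here is assumed to match Alexa's SSML dialect.
    """
    parts = ["<speak>"]
    if interjection:
        parts.append(
            f'<say-as interpret-as="interjection">{escape(interjection)}</say-as> '
        )
    # Escape the response text so characters like "&" keep the SSML well-formed.
    parts.append(
        f'<prosody pitch="{pitch}" rate="{rate}">{escape(text)}</prosody>'
    )
    parts.append("</speak>")
    return "".join(parts)


# Example: an upbeat response with a leading interjection.
ssml = with_prosody(
    "I love that movie & its sequel",
    pitch="high",
    rate="fast",
    interjection="wow",
)
print(ssml)
```

Building the markup in one helper keeps the response text separate from the delivery, so the same content can be rendered with and without modulation when comparing the two conditions.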
