Recognizing Uncertainty in Speech
Heather Pon-Barry, Stuart M. Shieber

TL;DR
This paper explores how prosodic features in speech can be used to infer a speaker's level of certainty. Adding phrase-level features improves certainty classification and lets the models identify which phrase a speaker is uncertain about, with implications for speech-based dialogue systems.
Contribution
It introduces a novel method for eliciting utterances of varying certainty levels and demonstrates that phrase-level prosodic features, used alongside utterance-level features, improve certainty detection.
Findings
Phrase-level prosodic features, added to utterance-level features, improve certainty classification.
The models can predict which phrase a speaker is uncertain about.
Speakers' self-reported certainty often differs from the certainty that listeners perceive.
Abstract
We address the problem of inferring a speaker's level of certainty based on prosodic information in the speech signal, which has application in speech-based dialogue systems. We show that using phrase-level prosodic features centered around the phrases causing uncertainty, in addition to utterance-level prosodic features, improves our model's level of certainty classification. In addition, our models can be used to predict which phrase a person is uncertain about. These results rely on a novel method for eliciting utterances of varying levels of certainty that allows us to compare the utility of contextually-based feature sets. We elicit level of certainty ratings from both the speakers themselves and a panel of listeners, finding that there is often a mismatch between speakers' internal states and their perceived states, and highlighting the importance of this distinction.
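To make the idea of combining utterance-level and phrase-level prosodic features concrete, here is a minimal sketch. It is not the authors' implementation: the choice of pitch (f0) as the only prosodic cue, the particular summary statistics, and the names `summarize` and `prosodic_features` are all assumptions made for illustration. The abstract does not specify which features or classifier the paper uses.

```python
from statistics import mean, pstdev


def summarize(values):
    """Return [min, max, mean, std, range] for a contour (hypothetical feature set)."""
    lo, hi = min(values), max(values)
    return [lo, hi, mean(values), pstdev(values), hi - lo]


def prosodic_features(f0, phrase_start, phrase_end):
    """Concatenate utterance-level and phrase-level pitch statistics.

    f0: per-frame pitch values in Hz (assumed to contain voiced frames only).
    phrase_start, phrase_end: frame indices of the target phrase, end exclusive.
    """
    utterance_level = summarize(f0)
    phrase_level = summarize(f0[phrase_start:phrase_end])
    return utterance_level + phrase_level


# Toy pitch contour; frames 3..5 stand in for the phrase causing uncertainty.
f0 = [200.0, 210.0, 220.0, 180.0, 170.0, 160.0, 190.0, 200.0]
features = prosodic_features(f0, 3, 6)
print(features)
```

The resulting vector could be passed to any standard classifier. Keeping the two halves separate also makes it easy to compare an utterance-only feature set against the combined one, which is the kind of comparison the abstract describes.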
