Who Finds This Voice Attractive? A Large-Scale Experiment Using In-the-Wild Data

Hitoshi Suda; Aya Watanabe; Shinnosuke Takamichi

arXiv:2407.04270·eess.AS·July 8, 2024

TL;DR

This study presents CocoNut-Humoresque, a large-scale, open-source speech likability corpus with listener ratings and speaker attributes, enabling analysis of factors influencing voice attractiveness.

Contribution

It introduces a new extensive dataset for voice likability research, including diverse speaker attributes and listener ratings, and provides initial analysis of biases and acoustic correlations.

Findings

01. Gender and age biases in voice likability identified.

02. Correlation between fundamental frequency, x-vectors, and likability analyzed.

03. Dataset enables large-scale statistical analysis of voice attractiveness.

Abstract

This paper introduces CocoNut-Humoresque, an open-source, large-scale speech likability corpus that includes speech segments and their per-listener likability scores. Evaluating voice likability is essential to designing preferable voices for speech systems, such as dialogue or announcement systems. In this study, 885 listeners rated the likability of 1800 speech segments from a wide range of speakers. When constructing the corpus, we also collected multiple speaker attributes: genders, ages, and favorite YouTube videos. The corpus therefore enables large-scale statistical analysis of voice likability with respect to both speaker and listener factors. This paper describes the construction methodology and a preliminary data analysis that reveals gender and age biases in voice likability. In addition, the relationship between the likability and two acoustic features, the…
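As a rough illustration of the kind of analysis the abstract describes, the sketch below averages per-listener likability scores for each segment and correlates the result with mean fundamental frequency. Everything in it is an assumption for illustration: the record layout, the identifiers, the rating values, and the F0 values are hypothetical toy data, not the corpus format or the authors' method.

```python
from collections import defaultdict
from math import sqrt

# Hypothetical toy ratings: (listener_id, segment_id, likability score).
# These values are made up for illustration; they are not from the corpus.
ratings = [
    ("l1", "s1", 4), ("l2", "s1", 5),
    ("l1", "s2", 2), ("l2", "s2", 3),
    ("l1", "s3", 3), ("l2", "s3", 4),
]

# Hypothetical mean fundamental frequency (Hz) per segment.
mean_f0 = {"s1": 220.0, "s2": 110.0, "s3": 165.0}


def mean_score_per_segment(rows):
    """Average the per-listener scores for each segment."""
    scores = defaultdict(list)
    for _listener, segment, score in rows:
        scores[segment].append(score)
    return {seg: sum(vals) / len(vals) for seg, vals in scores.items()}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


segment_means = mean_score_per_segment(ratings)
segments = sorted(segment_means)
r = pearson(
    [mean_f0[s] for s in segments],
    [segment_means[s] for s in segments],
)
print(f"Pearson r between mean F0 and mean likability: {r:.3f}")
```

The toy data is linear by construction, so the coefficient here carries no empirical meaning. With real per-listener scores, the same grouping step could be repeated per listener gender or age band to look for the biases the paper reports.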

Taxonomy

Topics: Speech and dialogue systems