Generación de voces artificiales infantiles en castellano con acento costarricense

Ana Lilia Alvarez-Blanco; Eugenia Cordoba-Warner; Marvin Coto-Jimenez; Vivian Fallas-Lopez; Maribel Morales Rodriguez

arXiv:2102.01692·cs.SD·February 4, 2021

Open Access

TL;DR

This paper explores the initial development of artificial children's voices with a Costa Rican accent using statistical parametric speech synthesis, highlighting challenges in intelligibility and in listeners' detection of the speaker's age and gender.

Contribution

It presents a novel application of Hidden Markov Model-based synthesis for generating Costa Rican-accented children's voices and provides a subjective evaluation of the results.

Findings

01. Lower intelligibility compared to natural voices
02. Significant difficulty in detecting age and gender
03. Highlights need for larger datasets

Abstract

This article evaluates a first experience of generating artificial children's voices with a Costa Rican accent, using statistical parametric speech synthesis based on Hidden Markov Models. It describes the process of recording the voice samples used to train the models, the fundamentals of the technique, and the subjective evaluation of the results through the perception of a group of listeners. The results show that the intelligibility of the synthetic voices, evaluated on isolated words, is lower than that of the voices recorded by the group of participating children. Similarly, detection of the speaker's age and gender is significantly affected in the artificial voices, relative to recordings of natural voices. These results show the need to obtain larger amounts of data, in addition to becoming a numerical reference for future developments resulting from new data…

Taxonomy

Topics: Speech Recognition and Synthesis