Independent and automatic evaluation of acoustic-to-articulatory inversion models

Maud Parrot; Juliette Millet; Ewan Dunbar

arXiv:1911.06573 · eess.AS · November 25, 2019 · 5 cites

TL;DR

This paper introduces a speaker-independent evaluation method for acoustic-to-articulatory inversion models, based on the ABX phone discrimination task, that makes it possible to assess the effect of merging datasets and the robustness of a model across speakers.

Contribution

It proposes a novel ABX-based evaluation method that is independent of the training dataset, improving assessment of speaker independence and dataset integration in articulatory reconstruction models.
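To make the ABX idea concrete, here is a minimal sketch in Python. It is not the authors' implementation: the function names, the dynamic time warping (DTW) distance with Euclidean frame distances, and the length normalisation are all assumptions chosen for illustration. Each token is a hypothetical predicted articulatory trajectory of shape (frames, articulators) for one phone. For every triple where A and X share a phone label and B does not, X counts as correctly discriminated when it is closer to A than to B.

```python
import itertools

import numpy as np


def dtw_distance(x, y):
    """Length-normalised DTW cost between two (frames, dims) trajectories.

    Assumption: Euclidean frame distance and normalisation by n + m;
    other choices (e.g. angular distance) are equally plausible.
    """
    # Distance between every frame of x and every frame of y.
    frame_dist = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)
    n, m = frame_dist.shape
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i, j] = frame_dist[i - 1, j - 1] + min(
                cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1]
            )
    return cost[n, m] / (n + m)


def abx_error_rate(tokens, labels, distance=dtw_distance):
    """Fraction of (A, B, X) triples in which X is not closer to A than to B.

    A and X carry the same label, B carries a different one.
    Ties count as half an error.
    """
    errors, total = 0.0, 0
    indices = range(len(tokens))
    for a, x in itertools.permutations(indices, 2):
        if labels[a] != labels[x]:
            continue
        d_ax = distance(tokens[a], tokens[x])
        for b in indices:
            if labels[b] == labels[a]:
                continue
            d_bx = distance(tokens[b], tokens[x])
            if d_ax > d_bx:
                errors += 1.0
            elif d_ax == d_bx:
                errors += 0.5
            total += 1
    if total == 0:
        raise ValueError("need at least two tokens of one label and one of another")
    return errors / total
```

Because the score depends only on the predicted trajectories and their phone labels, it can be computed on speech for which no articulatory recordings exist, which is what makes the evaluation independent of the training dataset.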

Findings

01 The ABX measure provides information complementary to that of the standard metrics.

02 A model trained on multiple datasets shows improved speaker independence.

03 The new evaluation method makes it possible to assess the effect of merging datasets.

Abstract

Reconstruction of articulatory trajectories from the acoustic speech signal has been proposed for improving speech recognition and text-to-speech synthesis. However, to be useful in these settings, articulatory reconstruction must be speaker independent. Furthermore, as most research focuses on single, small datasets with few speakers, robust articulatory reconstruction could profit from combining datasets. Standard evaluation measures such as root mean square error and Pearson correlation are inappropriate for evaluating the speaker-independence of models or the usefulness of combining datasets. We present a new evaluation for articulatory reconstruction which is independent of the articulatory dataset used for training: the phone discrimination ABX task. We use the ABX measure to evaluate a Bi-LSTM based model trained on 3 datasets (14 speakers), and show that it gives information…
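For reference, the two standard measures named in the abstract can be sketched as follows. This is an illustrative NumPy version, not code from the paper; the function names and the (frames, articulators) array layout are assumptions. Both functions compare a predicted trajectory with a measured one frame by frame, so they can only be computed for speakers whose articulatory recordings are available.

```python
import numpy as np


def rmse_per_articulator(measured, predicted):
    """Root mean square error over time, one value per articulator.

    Both inputs are assumed to be time-aligned arrays of shape
    (frames, articulators).
    """
    return np.sqrt(np.mean((measured - predicted) ** 2, axis=0))


def pearson_per_articulator(measured, predicted):
    """Pearson correlation over time, one value per articulator."""
    # Centre each articulator's trajectory around its own mean.
    measured_c = measured - measured.mean(axis=0)
    predicted_c = predicted - predicted.mean(axis=0)
    numerator = (measured_c * predicted_c).sum(axis=0)
    denominator = np.sqrt(
        (measured_c ** 2).sum(axis=0) * (predicted_c ** 2).sum(axis=0)
    )
    return numerator / denominator
```

A prediction that is offset from the measured trajectory by a constant has a nonzero RMSE but a correlation of 1, which is one reason the two measures are usually reported together.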

Code & Models

Repositories

bootphon/articulatory_inversion (PyTorch, official)

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Phonetics and Phonology Research