Independent and automatic evaluation of acoustic-to-articulatory inversion models
Maud Parrot, Juliette Millet, Ewan Dunbar

TL;DR
This paper introduces a new speaker-independent evaluation method for acoustic-to-articulatory inversion models using the ABX task, enabling better assessment of dataset merging and model robustness.
Contribution
It proposes a novel ABX-based evaluation method that is independent of the training dataset, improving assessment of speaker independence and dataset integration in articulatory reconstruction models.
Findings
The ABX measure provides insights complementary to those of standard metrics.
A model trained on multiple datasets shows improved speaker independence.
The new evaluation method enables assessment of dataset merging effects.
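For context, the standard metrics mentioned above are root mean square error (RMSE) and Pearson correlation, computed between predicted and measured articulatory trajectories. The sketch below shows one common way to compute them per articulatory channel with NumPy. The function names, array shapes, and toy data are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def rmse_per_channel(pred, target):
    """RMSE for each articulatory channel; inputs have shape (frames, channels)."""
    return np.sqrt(np.mean((pred - target) ** 2, axis=0))


def pearson_per_channel(pred, target):
    """Pearson correlation for each channel; inputs have shape (frames, channels)."""
    pred_c = pred - pred.mean(axis=0)
    target_c = target - target.mean(axis=0)
    num = (pred_c * target_c).sum(axis=0)
    den = np.sqrt((pred_c ** 2).sum(axis=0) * (target_c ** 2).sum(axis=0))
    return num / den


# Toy data (hypothetical): 4 frames, 2 articulatory channels.
target = np.array([[0.0, 1.0], [1.0, 3.0], [2.0, 2.0], [3.0, 5.0]])
# A prediction with the right shape but a constant offset of 0.5.
pred = target + 0.5

rmse = rmse_per_channel(pred, target)
corr = pearson_per_channel(pred, target)
```

Both functions need measured articulatory trajectories for the same utterances as the predictions, so they can only score a model against recorded articulatory data.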
Abstract
Reconstruction of articulatory trajectories from the acoustic speech signal has been proposed for improving speech recognition and text-to-speech synthesis. However, to be useful in these settings, articulatory reconstruction must be speaker independent. Furthermore, as most research focuses on single, small datasets with few speakers, robust articulatory reconstruction could profit from combining datasets. Standard evaluation measures such as root mean square error and Pearson correlation are inappropriate for evaluating the speaker-independence of models or the usefulness of combining datasets. We present a new evaluation for articulatory reconstruction which is independent of the articulatory dataset used for training: the phone discrimination ABX task. We use the ABX measure to evaluate a Bi-LSTM based model trained on 3 datasets (14 speakers), and show that it gives information…
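To make the ABX idea concrete: in an ABX phone discrimination test, A and B are tokens of two different phone categories and X is another token of the same category as A. The representation is scored as correct when X is closer to A than to B. The sketch below is a minimal illustration under stated assumptions, not the paper's implementation. The dynamic time warping (DTW) distance with Euclidean frame distances, the length normalisation, the half-error rule for ties, and all names are choices made for this example.

```python
import numpy as np


def dtw_distance(x, y):
    """Length-normalised DTW cost between sequences of shape (frames, dims).

    Assumption: Euclidean distance between frames, cost divided by n + m.
    """
    n, m = len(x), len(y)
    # Pairwise Euclidean distances between every frame of x and every frame of y.
    frame_dist = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=-1)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = frame_dist[i - 1, j - 1] + min(
                acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1]
            )
    return acc[n, m] / (n + m)


def abx_error_rate(triplets, distance=dtw_distance):
    """Fraction of (A, B, X) triplets in which X is not closer to A than to B.

    X is assumed to share its phone category with A. Ties count as half an error.
    """
    errors = 0.0
    for a, b, x in triplets:
        d_ax, d_bx = distance(a, x), distance(b, x)
        if d_ax > d_bx:
            errors += 1.0
        elif d_ax == d_bx:
            errors += 0.5
    return errors / len(triplets)


# Toy tokens (hypothetical): short 2-dimensional trajectories for two categories.
a = np.array([[0.0, 0.0], [0.0, 0.1]])
b = np.array([[1.0, 1.0], [1.0, 0.9], [1.0, 1.0]])
x_like_a = np.array([[0.1, 0.0], [0.0, 0.0]])
x_like_b = b + 0.05

triplets = [(a, b, x_like_a), (b, a, x_like_b)]
error_rate = abx_error_rate(triplets)
```

The score depends only on phone labels and on distances between the model's own outputs, so no measured articulatory trajectories are needed at test time. This is what lets the measure be applied to speakers and corpora outside the articulatory training data.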
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Phonetics and Phonology Research
