Articulatory-WaveNet: Autoregressive Model For Acoustic-to-Articulatory Inversion
Narjes Bozorg, Michael T. Johnson

TL;DR
This paper introduces Articulatory-WaveNet, an autoregressive model based on the WaveNet architecture for acoustic-to-articulatory inversion. It improves both correlation and RMSE over an HMM-GMM baseline.
Contribution
The paper presents the first application of a waveform synthesis approach to acoustic-to-articulatory inversion, using the WaveNet architecture to improve performance.
Findings
Average correlation of 0.83, a 36% relative improvement over the 0.61 obtained with the HMM-GMM baseline
Significant reduction in RMSE for articulatory trajectories
First use of point-by-point waveform synthesis in this domain
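The two metrics named in the findings, and the arithmetic behind the 36% figure, can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names `pearson`, `rmse`, and `relative_improvement` are hypothetical, and only the values 0.83 and 0.61 come from the paper.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length trajectories."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def rmse(x, y):
    """Root-mean-square error between two equal-length trajectories."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

def relative_improvement(new, baseline):
    """Relative gain of `new` over `baseline`, as a fraction."""
    return (new - baseline) / baseline

# Reported correlations: 0.83 (Articulatory-WaveNet) vs 0.61 (HMM-GMM baseline).
gain = relative_improvement(0.83, 0.61)
print(f"{gain:.1%}")  # 36.1%
```

The 36% is therefore a relative gain, (0.83 - 0.61) / 0.61, not a difference of 36 correlation points.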
Abstract
This paper presents Articulatory-WaveNet, a new approach for acoustic-to-articulatory inversion. The proposed system uses the WaveNet speech synthesis architecture, with dilated causal convolutional layers using previous values of the predicted articulatory trajectories conditioned on acoustic features. The system was trained and evaluated on the ElectroMagnetic Articulography corpus of Mandarin Accented English (EMA-MAE), consisting of 39 speakers including both native English speakers and native Mandarin speakers speaking English. Results show significant improvement in both correlation and RMSE between the generated and true articulatory trajectories for the new method, with an average correlation of 0.83, representing a 36% relative improvement over the 0.61 correlation obtained with a baseline Hidden Markov Model (HMM)-Gaussian Mixture Model (GMM) inversion framework. To the best of…
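The core mechanism described in the abstract, dilated causal convolution over past predictions with acoustic conditioning, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: the function names, the scalar acoustic feature, the single layer, the `tanh` output, and the weights are hypothetical and do not reproduce the paper's network or its hyperparameters.

```python
import math

def dilated_causal_conv(x, weights, dilation):
    """y[t] = sum_k weights[k] * x[t - k*dilation], treating x as zero before t = 0."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:  # causal: never look at future samples
                acc += w * x[idx]
        out.append(acc)
    return out

def receptive_field(kernel_size, dilations):
    """Number of input samples seen by a stack of dilated causal layers."""
    return 1 + (kernel_size - 1) * sum(dilations)

def generate(acoustic, ar_weights, cond_weight, dilation):
    """Toy autoregressive loop: each articulatory sample depends on
    strictly earlier predictions plus the current acoustic feature."""
    traj = []
    for t, c in enumerate(acoustic):
        acc = cond_weight * c  # conditioning on the acoustic input
        for k, w in enumerate(ar_weights):
            idx = t - 1 - k * dilation  # only previously generated samples
            if idx >= 0:
                acc += w * traj[idx]
        traj.append(math.tanh(acc))
    return traj

print(dilated_causal_conv([1, 2, 3, 4], [1, 1], dilation=2))  # [1.0, 2.0, 4.0, 6.0]
print(receptive_field(2, [1, 2, 4, 8]))  # 16
```

Doubling the dilation at each layer is what lets a WaveNet-style stack cover a long history with few layers: four layers of kernel size 2 already see 16 samples in this sketch.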
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Phonetics and Phonology Research
Methods: Mixture of Logistic Distributions · Dilated Causal Convolution · WaveNet
