Articulation GAN: Unsupervised modeling of articulatory learning
Gašper Beguš, Alan Zhou, Peter Wu, Gopala K Anumanchipalli

TL;DR
This paper introduces an unsupervised generative model that learns to produce articulatory representations of speech, closely mimicking human speech production, and then converts these to waveforms for speech synthesis.
Contribution
It presents the Articulatory Generator, a novel unsupervised model that learns to generate articulatory features (EMA), which a separate pre-trained model (ema2wav) then transforms into speech waveforms, bridging physical speech production and neural network modeling.
Findings
Articulatory analysis suggests that the model learns to control articulators in a way similar to humans.
Generated speech includes both seen and unseen words.
Articulatory representations have implications for cognitive speech models.
Abstract
Generative deep neural networks are widely used for speech synthesis, but most existing models directly generate waveforms or spectral outputs. Humans, however, produce speech by controlling articulators, which results in the production of speech sounds through physical properties of sound propagation. We introduce the Articulatory Generator to the Generative Adversarial Network paradigm, a new unsupervised generative model of speech production/synthesis. The Articulatory Generator more closely mimics human speech production by learning to generate articulatory representations (electromagnetic articulography or EMA) in a fully unsupervised manner. A separate pre-trained physical model (ema2wav) then transforms the generated EMA representations to speech waveforms, which get sent to the Discriminator for evaluation. Articulatory analysis suggests that the network learns to control…
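The data flow described in the abstract (latent input, generated EMA trajectories, a frozen EMA-to-waveform model, and a Discriminator that only ever sees waveforms) can be sketched as a toy forward pass. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the channel and frame counts, and the sinusoidal stand-in for ema2wav are all hypothetical, and no training step is shown.

```python
import math
import random

# All sizes below are illustrative assumptions, not values from the paper.
N_EMA_CHANNELS = 12     # hypothetical number of articulatory channels
N_FRAMES = 20           # hypothetical number of EMA frames per utterance
SAMPLES_PER_FRAME = 8   # hypothetical upsampling factor to audio samples


def generator(z, weights):
    """Hypothetical generator: latent vector -> EMA-like trajectories.

    Returns a list of N_FRAMES frames, each with N_EMA_CHANNELS values
    squashed into [-1, 1] by tanh.
    """
    ema = []
    for t in range(N_FRAMES):
        frame = []
        for c in range(N_EMA_CHANNELS):
            s = sum(w * zi for w, zi in zip(weights[c], z))
            frame.append(math.tanh(s + 0.1 * t))
        ema.append(frame)
    return ema


def ema2wav_stub(ema):
    """Toy stand-in for the frozen, pre-trained ema2wav model.

    The real model is a learned network; here each frame just sets the
    amplitude and frequency of a short sinusoid segment.
    """
    wav = []
    for t, frame in enumerate(ema):
        amp = sum(abs(v) for v in frame) / len(frame)
        freq = 1.0 + abs(frame[0])
        for k in range(SAMPLES_PER_FRAME):
            n = t * SAMPLES_PER_FRAME + k
            wav.append(amp * math.sin(0.3 * freq * n))
    return wav


def discriminator(wav):
    """Hypothetical critic: scores a waveform; it never sees the EMA."""
    return sum(x * x for x in wav) / len(wav)  # placeholder: mean energy


def forward_pass(seed=0, latent_dim=4):
    """One generator -> ema2wav -> discriminator pass with random weights."""
    rng = random.Random(seed)
    z = [rng.uniform(-1, 1) for _ in range(latent_dim)]
    weights = [[rng.gauss(0, 1) for _ in range(latent_dim)]
               for _ in range(N_EMA_CHANNELS)]
    ema = generator(z, weights)
    wav = ema2wav_stub(ema)
    score = discriminator(wav)
    return ema, wav, score
```

The point of the sketch is the ordering: the Discriminator's feedback reaches the generator only through the waveform, so in a setup like this the generator has to find articulatory trajectories that the fixed EMA-to-waveform mapping turns into plausible speech.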
Taxonomy
Topics: Speech Recognition and Synthesis · Phonetics and Phonology Research
