CiwGAN and fiwGAN: Encoding information in acoustic data to model lexical learning with Generative Adversarial Networks
Gašper Beguš

TL;DR
This paper introduces ciwGAN and fiwGAN, two GAN architectures with an information-theoretic extension that encode lexical information in raw acoustic data, enabling unsupervised lexical learning and the generation of novel lexical outputs.
Contribution
The paper presents novel GAN-based models with a new latent space structure for unsupervised lexical learning from raw speech, allowing low-dimensional, feature-rich lexical representations.
Findings
Networks encode lexical items as categorical variables in latent space.
Manipulating latent variables can generate specific lexical items.
Networks produce linguistically meaningful novel lexical outputs.
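The findings above can be illustrated with a minimal sketch of how such a latent vector might be assembled. It assumes the categorical (ciwGAN-style) code is one-hot and the featural (fiwGAN-style) code is a binary vector, each concatenated with uniform noise. The function names, the latent size of 100, and the noise range are illustrative assumptions, not values taken from this summary.

```python
import numpy as np

def one_hot_code(index, n_classes):
    """Categorical code: one latent variable per lexical item (assumed one-hot)."""
    code = np.zeros(n_classes)
    code[index] = 1.0
    return code

def binary_code(index, n_bits):
    """Featural code: the class index written as n_bits binary features (assumed)."""
    return np.array([(index >> b) & 1 for b in reversed(range(n_bits))], dtype=float)

def latent_vector(code, latent_dim=100, rng=None):
    """Concatenate the code with uniform noise; latent_dim=100 is an assumption."""
    rng = np.random.default_rng() if rng is None else rng
    noise = rng.uniform(-1.0, 1.0, size=latent_dim - len(code))
    return np.concatenate([code, noise])

rng = np.random.default_rng(0)
z_cat = latent_vector(one_hot_code(3, n_classes=8), rng=rng)   # 8 code variables
z_feat = latent_vector(binary_code(3, n_bits=3), rng=rng)      # 3 code variables
```

Under these assumptions, eight classes need eight code variables in the categorical scheme but only three in the featural one, which is one way to read the claim about a very low-dimensional representation. Setting the code by hand while resampling the noise corresponds to manipulating the latent variables to request a specific lexical item.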
Abstract
How can deep neural networks encode information that corresponds to words in human speech into raw acoustic data? This paper proposes two neural network architectures for modeling unsupervised lexical learning from raw acoustic inputs: ciwGAN (Categorical InfoWaveGAN) and fiwGAN (Featural InfoWaveGAN). Both combine a Deep Convolutional GAN architecture for audio data (WaveGAN; arXiv:1705.07904) with InfoGAN (arXiv:1606.03657), an information-theoretic extension of GAN. The paper also proposes a new latent space structure that can model featural learning simultaneously with a higher-level classification and allows for a very low-dimensional vector representation of lexical items. Lexical learning is modeled as emergent from an architecture that forces a deep neural network to output data such that unique information is retrievable from its acoustic outputs. The networks trained on lexical items from…
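The requirement that unique information be retrievable from the generated audio can be sketched as a code-recovery loss. In InfoGAN-style training, an auxiliary network (the Q-network) estimates the latent code from the generator's output, and the mismatch is penalized. The sketch below assumes a softmax cross-entropy for a one-hot code and a sigmoid cross-entropy for a binary code. The function names and the example logits are hypothetical, and the exact loss used in the paper is not stated in this summary.

```python
import numpy as np

def categorical_code_loss(logits, one_hot):
    """Softmax cross-entropy between estimated logits and a one-hot code."""
    shifted = logits - logits.max()                      # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum())
    return float(-(one_hot * log_probs).sum())

def featural_code_loss(logits, bits):
    """Mean sigmoid cross-entropy between estimated logits and a binary code."""
    # Stable form of -[y*log(s(x)) + (1-y)*log(1-s(x))] with s the sigmoid
    per_bit = np.maximum(logits, 0) - logits * bits + np.log1p(np.exp(-np.abs(logits)))
    return float(per_bit.mean())

# Hypothetical Q-network outputs for one generated sample
target_cat = np.array([1.0, 0.0, 0.0])
good_cat = categorical_code_loss(np.array([5.0, 0.0, 0.0]), target_cat)
bad_cat = categorical_code_loss(np.array([0.0, 5.0, 0.0]), target_cat)

target_bits = np.array([0.0, 1.0, 1.0])
good_feat = featural_code_loss(np.array([-4.0, 4.0, 4.0]), target_bits)
bad_feat = featural_code_loss(np.array([4.0, -4.0, -4.0]), target_bits)
```

In both cases the loss is low only when the code can be read back from the output, which is the pressure that would push a generator toward acoustically distinct outputs for distinct codes.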
Taxonomy
Methods: Dense Connections · Softmax · Batch Normalization · Feedforward Network · Convolution · InfoGAN · Deep Convolutional GAN
