Generative Modeling for Low Dimensional Speech Attributes with Neural Spline Flows
Kevin J. Shih, Rafael Valle, Rohan Badlani, João Felipe Santos, Bryan Catanzaro

TL;DR
This paper investigates neural spline flows for modeling low-dimensional, discontinuous pitch attributes in speech synthesis, aiming to improve fine-grained control over speech prosody in generative models.
Contribution
It introduces techniques for modeling low-dimensional, discontinuous pitch features using Neural Spline flows in the context of speech synthesis.
Findings
Neural Spline flows effectively model discontinuous pitch attributes.
The proposed methods improve controllability of speech pitch in generative models.
Neural Spline flows outperform traditional affine-coupling mechanisms in this setting.
Abstract
Despite recent advances in generative modeling for text-to-speech synthesis, these models do not yet have the same fine-grained adjustability as pitch-conditioned deterministic models such as FastPitch and FastSpeech2. Pitch information is not only low-dimensional, but also discontinuous, making it particularly difficult to model in a generative setting. Our work explores several techniques for handling the aforementioned issues in the context of Normalizing Flow models. We also find this problem to be very well suited for Neural Spline flows, which are a highly expressive alternative to the more common affine-coupling mechanism in Normalizing Flows.
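To make the contrast between the two transform families concrete, the sketch below compares a scalar affine transform with a monotonic rational-quadratic spline, the piecewise form commonly used in neural spline flows. It is an illustration only, not the authors' implementation: the abstract does not specify the parameterization, the function names `affine_forward` and `rq_spline_forward` are hypothetical, and the knot positions and derivatives are hand-picked values that a real flow would predict from a conditioning network.

```python
import numpy as np


def affine_forward(x, log_scale, shift):
    """Affine transform y = x * exp(log_scale) + shift.

    Returns y and log|dy/dx|, which is the same constant for every input.
    """
    x = np.asarray(x, dtype=float)
    y = x * np.exp(log_scale) + shift
    return y, np.full_like(x, log_scale)


def rq_spline_forward(x, knots_x, knots_y, derivs):
    """Monotonic rational-quadratic spline on [knots_x[0], knots_x[-1]].

    knots_x, knots_y: strictly increasing arrays of length K + 1.
    derivs: positive derivatives at the knots, length K + 1.
    Assumes every x lies inside the spline interval.
    Returns y and log(dy/dx).
    """
    x = np.asarray(x, dtype=float)
    # Index of the bin containing each x (the last bin also owns the right edge).
    k = np.clip(np.searchsorted(knots_x, x, side="right") - 1,
                0, len(knots_x) - 2)
    x0, x1 = knots_x[k], knots_x[k + 1]
    y0, y1 = knots_y[k], knots_y[k + 1]
    d0, d1 = derivs[k], derivs[k + 1]

    s = (y1 - y0) / (x1 - x0)      # average slope of the bin
    xi = (x - x0) / (x1 - x0)      # position inside the bin, in [0, 1]
    denom = s + (d1 + d0 - 2.0 * s) * xi * (1.0 - xi)

    y = y0 + (y1 - y0) * (s * xi**2 + d0 * xi * (1.0 - xi)) / denom
    dydx = s**2 * (d1 * xi**2 + 2.0 * s * xi * (1.0 - xi)
                   + d0 * (1.0 - xi)**2) / denom**2
    return y, np.log(dydx)


# Hand-picked (hypothetical) spline parameters with three bins.
knots_x = np.array([-1.0, -0.2, 0.3, 1.0])
knots_y = np.array([-1.0, 0.5, 0.6, 1.0])
derivs = np.array([1.0, 0.3, 2.0, 1.0])

x = np.linspace(-1.0, 1.0, 201)
y_spline, logdet_spline = rq_spline_forward(x, knots_x, knots_y, derivs)
y_affine, logdet_affine = affine_forward(x, log_scale=0.5, shift=0.1)
```

The affine transform can only rescale and shift its input, so its log-derivative is constant. The spline's local slope varies from bin to bin while the map stays invertible, which is the extra expressiveness the abstract refers to.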
Taxonomy
Topics: Music and Audio Processing · Topic Modeling · Speech Recognition and Synthesis
Methods: Multi-Head Attention · Attention Is All You Need · Linear Layer · Convolution · Residual Connection · Layer Normalization · Dense Connections · Softmax · Normalizing Flows · FastPitch
