Speech vocoding for laboratory phonology
Milos Cernak, Stefan Benus, Alexandros Lazaridis

TL;DR
This paper introduces a phonological speech vocoding platform for exploring the relationship between phonology and speech processing, and demonstrates it on three applications: compositional phonological speech modelling, a comparison of phonological systems, and an experimental phonological parametric TTS system.
Contribution
It presents a novel platform integrating phonological representations into speech processing, enabling testing of phonological theories and improving speech synthesis applications.
Findings
eSPE-based vocoded speech achieves slightly better results than GP-based vocoded speech
GP performs comparably to more complex systems
Unsupervised phonological TTS achieves 85% intelligibility
Abstract
Using phonological speech vocoding, we propose a platform for exploring relations between phonology and speech processing, and in broader terms, for exploring relations between the abstract and physical structures of a speech signal. Our goal is to make a step towards bridging phonology and speech processing and to contribute to the program of Laboratory Phonology. We show three application examples for laboratory phonology: compositional phonological speech modelling, a comparison of phonological systems and an experimental phonological parametric text-to-speech (TTS) system. The featural representations of the following three phonological systems are considered in this work: (i) Government Phonology (GP), (ii) the Sound Pattern of English (SPE), and (iii) the extended SPE (eSPE). Comparing GP- and eSPE-based vocoded speech, we conclude that the latter achieves slightly better results…
