Analysis of a Modern Voice Morphing Approach using Gaussian Mixture Models for Laryngectomees
Aman Chadha, Bharatraaj Savardekar, Jay Padhya

TL;DR
This paper introduces a GMM-based voice morphing system for laryngectomees that overcomes the over-smoothing of conventional GMM mapping by separating the glottal waveform and predicting the excitation, producing high-quality, natural-sounding speech.
Contribution
A new GMM-based voice morphing method for laryngectomees that eliminates over-smoothing and improves speech naturalness, built around glottal waveform separation.
Findings
Over-smoothing is effectively eliminated.
Transformed vocal parameters match target speech.
Synthesized speech is of high quality.
Abstract
This paper proposes a voice morphing system for people who have undergone laryngectomy, the surgical removal of all or part of the larynx (the voice box), performed particularly in cases of laryngeal cancer. A basic method of achieving voice morphing is to extract the source speaker's vocal coefficients and convert them into the target speaker's vocal parameters. In this paper, we deploy Gaussian Mixture Models (GMM) to map the coefficients from source to target. However, the conventional GMM-based mapping approach over-smooths the converted voice. We therefore propose a method for efficient voice morphing and conversion based on GMM, which overcomes the over-smoothing of the conventional method. It uses a technique of glottal waveform separation and prediction of excitations and hence…
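For readers unfamiliar with the conventional GMM mapping that the abstract refers to, the following is a minimal sketch of it, not the authors' proposed method. It assumes a joint-density GMM over one scalar source feature x and one scalar target feature y, and converts x to the posterior-weighted conditional mean of y. The function names and all parameter values are hypothetical and chosen only for illustration; a real system would fit the mixture to aligned source and target feature vectors.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def gmm_convert(x, components):
    """Map a source feature x to a target feature with a joint-density GMM.

    Each component is a dict holding the mixture weight, the source and
    target means, the source variance and the target/source covariance.
    The result is sum_m p(m | x) * (mu_y + cov_yx / var_x * (x - mu_x)).
    """
    likelihoods = [
        c["weight"] * gaussian_pdf(x, c["mu_x"], c["var_x"]) for c in components
    ]
    total = sum(likelihoods)  # assumed non-zero, i.e. x is not far from every component
    y = 0.0
    for lik, c in zip(likelihoods, components):
        posterior = lik / total  # p(m | x)
        y += posterior * (c["mu_y"] + c["cov_yx"] / c["var_x"] * (x - c["mu_x"]))
    return y


# Hypothetical toy parameters, not taken from the paper.
COMPONENTS = [
    {"weight": 0.5, "mu_x": -1.0, "mu_y": 2.0, "var_x": 1.0, "cov_yx": 0.5},
    {"weight": 0.5, "mu_x": 1.0, "mu_y": 4.0, "var_x": 1.0, "cov_yx": 0.5},
]

if __name__ == "__main__":
    print(gmm_convert(0.0, COMPONENTS))  # 3.0 for these symmetric toy parameters
```

Because every converted value is an average over component-wise predictions, the output tends to have less variation than real target speech, which is one common explanation for the over-smoothing the abstract describes.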
