Enhancing the Intelligibility of Cleft Lip and Palate Speech using Cycle-consistent Adversarial Networks
Protima Nomo Sudro, Rohan Kumar Das, Rohit Sinha, S R Mahadeva Prasanna

TL;DR
This paper proposes using CycleGAN, a type of generative adversarial network, to enhance speech intelligibility in children with cleft lip and palate, aiming to aid speech therapy and improve accessibility.
Contribution
It introduces a novel application of CycleGAN for CLP speech enhancement, trained on Kannada-speaking children's data, with both objective and subjective evaluation confirming effectiveness.
Findings
Improved speech recognition accuracy on enhanced speech
Subjective evaluations show increased intelligibility
CycleGAN effectively models CLP speech characteristics
Abstract
Cleft lip and palate (CLP) refers to a congenital craniofacial condition that causes various speech-related disorders. As a result of structural and functional deformities, the affected subjects' speech intelligibility is significantly degraded, limiting the accessibility and usability of speech-controlled devices. Towards addressing this problem, it is desirable to improve the CLP speech intelligibility. Moreover, it would be useful during speech therapy. In this study, the cycle-consistent adversarial network (CycleGAN) method is exploited for improving CLP speech intelligibility. The model is trained on native Kannada-speaking children's speech data. The effectiveness of the proposed approach is also measured using automatic speech recognition performance. Further, subjective evaluation is performed, and those results also confirm the intelligibility improvement in the enhanced speech…
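The abstract does not spell out the training objective. As a rough illustration only, the sketch below shows the cycle-consistency term from the standard CycleGAN formulation, which asks that mapping features to the other domain and back reproduces the input. The generator names `G` and `F`, the mapping directions (CLP to normal and back), the feature shapes, and the toy generators are all assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def cycle_consistency_loss(x, y, G, F):
    """L1 cycle loss: mean|F(G(x)) - x| + mean|G(F(y)) - y|."""
    forward = np.mean(np.abs(F(G(x)) - x))   # x -> G -> F should return x
    backward = np.mean(np.abs(G(F(y)) - y))  # y -> F -> G should return y
    return forward + backward

# Toy stand-ins for the two generators (hypothetical, not trained networks).
G = lambda feats: feats + 1.0  # assumed "CLP -> normal" mapping
F = lambda feats: feats - 1.0  # assumed "normal -> CLP" mapping

clp_feats = np.zeros((4, 3))     # placeholder feature frames, shape assumed
normal_feats = np.ones((4, 3))

# G and F are exact inverses here, so the cycle loss vanishes.
loss = cycle_consistency_loss(clp_feats, normal_feats, G, F)
```

In a full CycleGAN this term is added to the adversarial losses of the two discriminators; without it, a generator could produce plausible target-domain speech that no longer corresponds to the input utterance.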
