MultiGen: Child-Friendly Multilingual Speech Generator with LLMs
Xiaoxue Gao, Huayun Zhang, Nancy F. Chen

TL;DR
MultiGen is a multilingual, child-friendly speech generation model leveraging LLMs, designed to improve communication for children in low-resource languages with culturally relevant content.
Contribution
It introduces a novel LLM-based approach for multilingual, child-friendly speech synthesis tailored for low-resource languages and diverse cultural contexts.
Findings
MultiGen outperforms baseline methods in objective metrics.
Subjective evaluations favor MultiGen's speech quality and cultural relevance.
Effective for three low-resource languages: Singaporean-accented Mandarin, Malay, and Tamil.
Abstract
Generative speech models have demonstrated significant potential in improving human-machine interactions, offering valuable real-world applications such as language learning for children. However, achieving high-quality, child-friendly speech generation remains challenging, particularly for low-resource languages across diverse cultural contexts. In this paper, we propose MultiGen, a multilingual speech generation model with child-friendly interaction, leveraging an LLM architecture for speech generation tailored to low-resource languages. We propose to integrate age-appropriate multilingual speech generation using LLM architectures, which can facilitate young children's communication with AI systems through culturally relevant context in three low-resource languages: Singaporean-accented Mandarin, Malay, and Tamil. Experimental results from both objective metrics…
