MultiGen: Child-Friendly Multilingual Speech Generator with LLMs

Xiaoxue Gao; Huayun Zhang; Nancy F. Chen

arXiv:2508.08715 · eess.AS · September 5, 2025

TL;DR

MultiGen is a multilingual, child-friendly speech generation model leveraging LLMs, designed to improve communication for children in low-resource languages with culturally relevant content.

Contribution

It introduces a novel LLM-based approach for multilingual, child-friendly speech synthesis tailored for low-resource languages and diverse cultural contexts.

Findings

01. MultiGen outperforms baseline methods on objective metrics.

02. Subjective evaluations favor MultiGen's speech quality and cultural relevance.

03. MultiGen is effective for three low-resource languages: Singaporean-accented Mandarin, Malay, and Tamil.

Abstract

Generative speech models have demonstrated significant potential in improving human-machine interactions, offering valuable real-world applications such as language learning for children. However, achieving high-quality, child-friendly speech generation remains challenging, particularly for low-resource languages spanning diverse linguistic and cultural contexts. In this paper, we propose MultiGen, a multilingual speech generation model with child-friendly interaction that leverages an LLM architecture for speech generation tailored to low-resource languages. We integrate age-appropriate multilingual speech generation using LLM architectures, which can facilitate young children's communication with AI systems through culturally relevant context in three low-resource languages: Singaporean-accented Mandarin, Malay, and Tamil. Experimental results from both objective metrics…
