Indonesian-English Code-Switching Speech Synthesizer Utilizing Multilingual STEN-TTS and Bert LID
Ahmad Alfani Handoyo, Chung Tran, Dessi Puji Lestari, Sakriani Sakti

TL;DR
This paper presents an Indonesian-English code-switching speech synthesizer that combines multilingual STEN-TTS with BERT-based language identification to improve the naturalness and intelligibility of mixed-language speech.
Contribution
It adapts STEN-TTS to code-switched input in two ways: a fine-tuned BERT language identifier tags each word during text-to-phoneme conversion, and the language embedding is removed from the base model.
Findings
The code-switching model outperforms the Indonesian and English baselines in naturalness
Improved speech intelligibility in Indonesian-English synthesis
Effective handling of mixed-language segments
Abstract
Multilingual text-to-speech systems convert text into speech across multiple languages. In many cases, text sentences may contain segments in different languages, a phenomenon known as code-switching. This is particularly common in Indonesia, especially between Indonesian and English. Despite its significance, no research has yet developed a multilingual TTS system capable of handling code-switching between these two languages. This study addresses Indonesian-English code-switching in STEN-TTS. Key modifications include adding a language identification component to the text-to-phoneme conversion using finetuned BERT for per-word language identification, as well as removing language embedding from the base model. Experimental results demonstrate that the code-switching model achieves superior naturalness and improved speech intelligibility compared to the Indonesian and English baseline…
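The per-word language identification step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the word-list lookup in `identify_language` is a hypothetical stand-in for the fine-tuned BERT tagger, and `g2p_id` and `g2p_en` are toy placeholders for real Indonesian and English grapheme-to-phoneme converters. The sketch only shows the routing idea, where each word is tagged with a language and then sent to the matching converter.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for the fine-tuned BERT language identifier.
# A real system would run a token-classification model here.
ENGLISH_WORDS = {"meeting", "deadline", "download", "file"}


def identify_language(word: str) -> str:
    """Return 'en' or 'id' for a single word (toy lookup, not BERT)."""
    return "en" if word.lower() in ENGLISH_WORDS else "id"


def g2p_id(word: str) -> List[str]:
    """Toy Indonesian G2P placeholder: lowercase letters as pseudo-phonemes."""
    return list(word.lower())


def g2p_en(word: str) -> List[str]:
    """Toy English G2P placeholder: uppercase letters as pseudo-phonemes."""
    return list(word.upper())


# Map each language tag to its (placeholder) converter.
G2P: Dict[str, Callable[[str], List[str]]] = {"id": g2p_id, "en": g2p_en}


def text_to_phonemes(
    sentence: str,
    lid: Callable[[str], str] = identify_language,
) -> List[Tuple[str, str, List[str]]]:
    """Tag every word with a language, then convert it with that language's G2P."""
    result = []
    for word in sentence.split():
        lang = lid(word)
        result.append((word, lang, G2P[lang](word)))
    return result


if __name__ == "__main__":
    for word, lang, phonemes in text_to_phonemes("Saya ada meeting besok"):
        print(word, lang, phonemes)
```

Passing the identifier in as the `lid` argument keeps the routing logic separate from the classifier, so the toy lookup could be swapped for a model-backed function without changing `text_to_phonemes`.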
Taxonomy
Methods: Attention Is All You Need · Attention Dropout · Linear Layer · Linear Warmup With Linear Decay · Dropout · Softmax · Dense Connections · WordPiece · Residual Connection
