Multi-Reward GRPO for Stable and Prosodic Single-Codebook TTS LLMs at Scale
Yicheng Zhong, Peiji Yang, Zhisheng Wang

TL;DR
This paper introduces a multi-reward GRPO reinforcement learning framework for single-codebook TTS large language models, targeting their unstable prosody, speaker drift, and degraded naturalness.
Contribution
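It proposes a multi-reward Group Relative Policy Optimization (GRPO) method that directly optimizes the token generation policy. Alongside standard intelligibility and speaker similarity objectives, it adds rule-based rewards covering duration consistency, decoding stability, and LLM-annotated prosody alignment.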
Findings
Improved prosodic stability and naturalness in single-codebook TTS LLMs.
Consistent improvements across different data sizes and model scales.
Additional gains when attaching a flow-matching decoder.
Abstract
Recent advances in Large Language Models (LLMs) have transformed text-to-speech (TTS) synthesis, inspiring autoregressive frameworks that represent speech as sequences of discrete codec tokens. Among them, single-codebook TTS LLMs have emerged as compact and streamable architectures that jointly model semantic and acoustic integration. However, despite their efficiency, these models often exhibit unstable prosody, speaker drift, and degraded naturalness. To address these issues, we propose a multi-reward Group Relative Policy Optimization (GRPO) framework that directly optimizes the token generation policy of single-codebook TTS LLMs. Beyond standard intelligibility and speaker similarity objectives, our design integrates three rule-based rewards: a length penalty for duration consistency, an entropy regularization reward for decoding stability, and an LLM-annotated prosody alignment…
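To make the group-relative idea concrete, here is a minimal sketch of how several reward components could be merged into one score per sampled utterance and then normalized within a group, as in the standard GRPO advantage (score minus group mean, divided by group standard deviation). The weighted-sum combination, the weight values, the reward names, and the `length_penalty` rule with its tolerance band are all assumptions made for illustration; the abstract does not state how the paper combines or scales its rewards.

```python
from statistics import mean, pstdev


def length_penalty(num_tokens, expected_tokens, tolerance=0.2):
    """Hypothetical duration rule: 0 inside a tolerance band, linearly negative outside."""
    deviation = abs(num_tokens - expected_tokens) / expected_tokens
    return -max(0.0, deviation - tolerance)


def combine_rewards(rewards, weights):
    """Weighted sum of named reward components (an assumed combination rule)."""
    return sum(weights[name] * value for name, value in rewards.items())


def group_relative_advantages(scores, eps=1e-8):
    """Standard GRPO normalization: center and scale scores within one sampled group."""
    mu = mean(scores)
    sigma = pstdev(scores)
    return [(s - mu) / (sigma + eps) for s in scores]


# Illustrative weights and reward values, not taken from the paper.
weights = {"intelligibility": 1.0, "speaker_sim": 1.0, "length": 0.5}

# One group: several utterances sampled for the same text prompt.
group = [
    {"intelligibility": 0.9, "speaker_sim": 0.8, "length": length_penalty(100, 100)},
    {"intelligibility": 0.7, "speaker_sim": 0.6, "length": length_penalty(150, 100)},
    {"intelligibility": 0.8, "speaker_sim": 0.7, "length": length_penalty(110, 100)},
]

scores = [combine_rewards(r, weights) for r in group]
advantages = group_relative_advantages(scores)
```

Because the advantages are centered within the group, samples that score above the group average are reinforced and those below are discouraged, without needing a separate learned value model.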
Taxonomy
Topics: Speech Recognition and Synthesis · Voice and Speech Disorders · Phonetics and Phonology Research
