Synthetic Data Generation for Phrase Break Prediction with Large Language Model
Hoyeon Lee, Sejung Son, Ye-Eun Kang, Jong-Hwan Kim

TL;DR
This paper investigates using large language models to generate synthetic phrase break annotations, aiming to reduce manual effort and improve data quality in speech prosody prediction across multiple languages.
Contribution
It introduces an approach that uses LLMs to generate synthetic data for phrase break prediction and demonstrates its effectiveness in comparison with traditional human annotations.
Findings
LLM-generated data reduces manual annotation effort.
Synthetic data improves phrase break prediction accuracy.
The method is effective across multiple languages.
Abstract
Current approaches to phrase break prediction address crucial prosodic aspects of text-to-speech systems but rely heavily on vast human annotations from audio or text, incurring significant manual effort and cost. Inherent variability in the speech domain, driven by phonetic factors, further complicates acquiring consistent, high-quality data. Recently, large language models (LLMs) have shown success in addressing data challenges in NLP by generating tailored synthetic data while reducing manual annotation needs. Motivated by this, we explore leveraging LLMs to generate synthetic phrase break annotations, addressing the challenges of both manual annotation and speech-related tasks by comparing with traditional annotations and assessing effectiveness across multiple languages. Our findings suggest that LLM-based synthetic data generation effectively mitigates data challenges in phrase…
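To make the idea of a synthetic phrase break annotation concrete, the following is a minimal sketch, not the authors' pipeline. It assumes a hypothetical scheme in which an LLM is asked to insert a "/" marker at pause positions, and the marked-up text is then converted into per-token break labels. The marker, the prompt wording, and the function names (build_prompt, parse_annotation, is_faithful) are all illustrative assumptions; no actual LLM call is made.

```python
BREAK = "/"  # hypothetical break marker; the paper's label scheme is not stated here


def build_prompt(sentence: str) -> str:
    """Build an illustrative annotation prompt (the wording is an assumption)."""
    return (
        f"Insert '{BREAK}' wherever a speaker would naturally pause. "
        "Do not change, add, or remove any words.\n"
        f"Sentence: {sentence}"
    )


def parse_annotation(annotated: str) -> tuple[list[str], list[int]]:
    """Convert marker-annotated text into tokens and per-token break labels.

    labels[i] == 1 means a phrase break follows tokens[i].
    """
    tokens: list[str] = []
    labels: list[int] = []
    for piece in annotated.split():
        has_break = piece.endswith(BREAK)
        word = piece.rstrip(BREAK)
        if word:
            # A word, possibly with the marker glued to its end ("stopped/").
            tokens.append(word)
            labels.append(int(has_break))
        elif labels:
            # A standalone marker: attach the break to the previous token.
            labels[-1] = 1
    return tokens, labels


def is_faithful(source: str, tokens: list[str]) -> bool:
    """Check that the annotated output kept the source words unchanged."""
    return source.split() == tokens


if __name__ == "__main__":
    source = "After the rain stopped we walked home"
    # Stand-in for an LLM response; a real response would come from a model.
    response = "After the rain stopped / we walked home"
    tokens, labels = parse_annotation(response)
    print(build_prompt(source))
    print(list(zip(tokens, labels)))
    print("faithful:", is_faithful(source, tokens))
```

The faithfulness check reflects a common concern with generated annotations: output that alters the source words cannot be aligned with the original text and would normally be discarded before training a predictor.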
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
