SonoEdit: Null-Space Constrained Knowledge Editing for Pronunciation Correction in LLM-Based TTS
Ayush Pratap Singh, Harshit Singh, Nityanand Mathur, Akshat Mandloi, Sudarshan Kamath

TL;DR
SonoEdit is a novel model editing method that precisely corrects pronunciation errors in pre-trained TTS models without retraining, using null-space constrained updates to preserve overall speech quality.
Contribution
The paper introduces Null-Space Pronunciation Editing, a single-shot, mathematically grounded approach for targeted pronunciation correction in TTS models without extensive data or retraining.
Findings
Effective correction of pronunciation errors in low-resource proper nouns.
Preserves the overall speech quality and model behavior.
Requires only a single parameter update, reducing computational costs.
Abstract
Neural text-to-speech (TTS) systems systematically mispronounce low-resource proper nouns, particularly non-English names, brands, and geographic locations, due to their underrepresentation in predominantly English training corpora. Existing solutions typically rely on expensive multilingual data collection, supervised finetuning, or manual phonetic annotation, which limits the deployment of TTS systems in linguistically diverse settings. We introduce SonoEdit, a model editing technique that surgically corrects pronunciation errors in pre-trained TTS models without retraining. Instead of costly finetuning or explicit phoneme injection, we propose a parsimonious alternative based on Null-Space Pronunciation Editing, which performs a single-shot parameter update to modify the pronunciation of specific words while provably preserving all other model behavior. We first adapt Acoustic Causal…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsSpeech Recognition and Synthesis · Topic Modeling · Natural Language Processing Techniques
