SNIPER Training: Single-Shot Sparse Training for Text-to-Speech

Perry Lam; Huayun Zhang; Nancy F. Chen; Berrak Sisman; Dorien Herremans

arXiv:2211.07283·eess.AS·June 4, 2024

TL;DR

SNIPER training applies decaying sparsity to TTS models: a high initial sparsity accelerates early training, and progressively lowering it yields better final performance than constant-sparsity or dense training, at a training time comparable to dense models.

Contribution

The paper proposes SNIPER training, a novel decaying sparsity method for TTS models that improves training efficiency and final performance over constant-sparsity and dense models.

Findings

1. SNIPER training accelerates early training loss reduction.
2. SNIPER models outperform constant-sparsity and dense models.
3. Training time remains comparable to dense models.

Abstract

Text-to-speech (TTS) models have achieved remarkable naturalness in recent years, yet like most deep neural models, they have more parameters than necessary. Sparse TTS models can improve on dense models via pruning and extra retraining, or converge faster than dense models with some performance loss. Thus, we propose training TTS models using decaying sparsity, i.e. a high initial sparsity to accelerate training first, followed by a progressive rate reduction to obtain better eventual performance. This decremental approach differs from current methods of incrementing sparsity to a desired target, which costs significantly more time than dense training. We call our method SNIPER training: Single-shot Initialization Pruning Evolving-Rate training. Our experiments on FastSpeech2 show that we were able to obtain better losses in the first few training epochs with SNIPER, and that the final…
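
The decaying-sparsity idea in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the stepwise schedule in `sparsity_at`, its default values, the saliency score |weight × gradient| (in the style of single-shot pruning at initialization), and the choice to score once and reuse those scores for every sparsity level.

```python
def sparsity_at(epoch, initial=0.8, step=0.2, epochs_per_stage=5):
    """Hypothetical schedule: start at `initial` sparsity and drop by
    `step` every `epochs_per_stage` epochs, never going below 0 (dense)."""
    stage = epoch // epochs_per_stage
    return max(0.0, round(initial - step * stage, 10))


def saliency_scores(weights, grads):
    """Score each weight once, at initialization, by |weight * gradient|.
    This scoring rule is an assumption of the sketch."""
    return [abs(w * g) for w, g in zip(weights, grads)]


def mask_for(scores, sparsity):
    """Return a 0/1 mask that keeps the highest-scoring weights and
    zeroes out roughly a `sparsity` fraction of them."""
    n_total = len(scores)
    n_keep = n_total - int(round(sparsity * n_total))
    ranked = sorted(range(n_total), key=lambda i: scores[i], reverse=True)
    keep = set(ranked[:n_keep])
    return [1 if i in keep else 0 for i in range(n_total)]


if __name__ == "__main__":
    # Toy weights and gradients standing in for one layer of a model.
    weights = [1.0, -2.0, 0.5, 3.0, 0.1]
    grads = [0.1, 0.5, 0.2, 0.0, 1.0]
    scores = saliency_scores(weights, grads)
    for epoch in (0, 5, 10, 15, 20):
        s = sparsity_at(epoch)
        print(epoch, s, mask_for(scores, s))
```

In a training loop, the mask for the current epoch would be multiplied into the layer's weights before each forward pass, so the network starts mostly pruned and ends fully dense. Contrast this with incremental methods, which move in the opposite direction from dense to a target sparsity.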

Taxonomy

Topics: Speech Recognition and Synthesis · Topic Modeling · Natural Language Processing Techniques

Methods: Pruning · SNIPER