FreeV: Free Lunch For Vocoders Through Pseudo Inversed Mel Filter
Yuanjun Lv, Hai Li, Ying Yan, Junhui Liu, Danming Xie, Lei Xie

TL;DR
FreeV is a frequency-domain vocoder that uses pseudo-inverse initialization to cut the parameter count to nearly half that of APNet2 and run inference 1.8x faster, while maintaining high speech quality.
Contribution
The paper introduces FreeV, a vocoder that employs pseudo-inverse initialization and a streamlined amplitude prediction branch to reduce parameters and enhance inference speed without sacrificing quality.
Findings
FreeV achieves 1.8x faster inference than APNet2.
FreeV has nearly half the parameters of APNet2.
FreeV outperforms APNet2 in speech resynthesis quality.
Abstract
Vocoders reconstruct speech waveforms from acoustic features and play a pivotal role in modern TTS systems. Frequency-domain GAN vocoders such as Vocos and APNet2 have recently advanced rapidly, outperforming time-domain models in inference speed while achieving comparable audio quality. However, these frequency-domain vocoders have large parameter counts, which adds memory overhead. Inspired by PriorGrad and SpecGrad, we use the pseudo-inverse of the mel filter to obtain a rough estimate of the amplitude spectrum as the initialization. This simple initialization significantly reduces the number of parameters the vocoder needs. Building on APNet2 and our streamlined amplitude prediction branch, we propose FreeV. Compared with its counterpart APNet2, FreeV achieves a 1.8 times inference speed improvement with nearly half the parameters. Meanwhile, FreeV outperforms APNet2 in resynthesis quality,…
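The pseudo-inverse initialization described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The helper names (triangular_mel_filterbank, pseudo_inverse_amplitude), the triangular HTK-style filterbank, the linear-amplitude mel input, and the 1e-5 floor are all assumptions made for illustration. The idea shown is that if a mel spectrogram is produced as X = M A, then pinv(M) X gives a rough amplitude estimate that a network could refine rather than predict from scratch.

```python
import numpy as np


def triangular_mel_filterbank(n_mels, n_freqs, sample_rate):
    """Hypothetical helper: build an (n_mels, n_freqs) triangular mel filterbank."""
    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    # Centre frequency of each linear STFT bin, from 0 Hz to Nyquist.
    freqs = np.linspace(0.0, sample_rate / 2.0, n_freqs)
    # n_mels + 2 band edges, equally spaced on the mel scale.
    mel_edges = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_mels + 2)
    edges = mel_to_hz(mel_edges)
    lower, center, upper = edges[:-2, None], edges[1:-1, None], edges[2:, None]
    rising = (freqs[None, :] - lower) / (center - lower)
    falling = (upper - freqs[None, :]) / (upper - center)
    # Each row is a triangle; values outside the band are clipped to zero.
    return np.clip(np.minimum(rising, falling), 0.0, None)


def pseudo_inverse_amplitude(mel_spec, mel_fb, floor=1e-5):
    """Rough amplitude estimate from a linear-scale mel spectrogram.

    mel_spec: (n_mels, n_frames), mel_fb: (n_mels, n_freqs).
    The floor value is an assumption; it keeps the estimate positive
    so that a logarithm can be taken afterwards.
    """
    inv_fb = np.linalg.pinv(mel_fb)      # (n_freqs, n_mels)
    approx = inv_fb @ mel_spec           # (n_freqs, n_frames)
    return np.maximum(approx, floor)


# Toy usage with random data standing in for a real amplitude spectrogram.
rng = np.random.default_rng(0)
mel_fb = triangular_mel_filterbank(n_mels=80, n_freqs=513, sample_rate=22050)
amplitude = rng.random((513, 10))
mel_spec = mel_fb @ amplitude
amplitude_init = pseudo_inverse_amplitude(mel_spec, mel_fb)
print(amplitude_init.shape)
```

Because the filterbank maps 513 frequency bins to 80 mel bands, the estimate cannot recover fine spectral detail. It only supplies a starting point that is consistent with the observed mel spectrogram.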
Taxonomy
Topics: Nanomaterials and Printing Technologies · Artificial Immune Systems Applications · Intravenous Infusion Technology and Safety
