Bunched LPCNet2: Efficient Neural Vocoders Covering Devices from Cloud to Edge
Sangjun Park, Kihyun Choo, Joohyung Lee, Anton V. Porov, Konstantin Osipov, June Sig Sung

TL;DR
This paper introduces Bunched LPCNet2, an efficient neural vocoder that delivers high-quality speech synthesis on both cloud and edge devices, optimizing for low complexity, small footprint, and real-time performance.
Contribution
It proposes an improved LPCNet architecture with a logistic distribution and dual-rate design, achieving high speech quality with minimal model size and computational cost.
Findings
Achieves 1.1MB model size with satisfactory speech quality
Operates faster than real-time on Raspberry Pi 3B
Maintains high speech quality with reduced model footprint
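"Faster than real-time" is usually expressed as a real-time factor (RTF) below 1: the time spent synthesizing divided by the duration of the audio produced. The sketch below illustrates that definition only; the function name and the numbers in the usage lines are hypothetical and are not measurements from the paper.

```python
def real_time_factor(synthesis_seconds, num_samples, sample_rate_hz):
    """Return synthesis time divided by audio duration (RTF < 1 is faster than real time)."""
    if num_samples <= 0 or sample_rate_hz <= 0:
        raise ValueError("num_samples and sample_rate_hz must be positive")
    audio_seconds = num_samples / sample_rate_hz
    return synthesis_seconds / audio_seconds


# Hypothetical numbers: 2.0 s of compute for 48,000 samples at 16 kHz (3.0 s of audio).
rtf = real_time_factor(2.0, 48_000, 16_000)
is_faster_than_real_time = rtf < 1.0
```

With these illustrative inputs the vocoder would need two thirds of the audio's duration to generate it, which counts as faster than real time.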
Abstract
Text-to-Speech (TTS) services that run on edge devices have many advantages over cloud TTS, e.g., in latency and privacy. However, neural vocoders with low complexity and a small model footprint inevitably generate annoying sounds. This study proposes Bunched LPCNet2, an improved LPCNet architecture that provides highly efficient, high-quality synthesis for cloud servers and low-complexity synthesis for low-resource edge devices. A single logistic distribution achieves computational efficiency, and insightful tricks reduce the model footprint while maintaining speech quality. A DualRate architecture, which generates a lower sampling rate from a prosody model, is also proposed to reduce maintenance costs. The experiments demonstrate that Bunched LPCNet2 generates satisfactory speech quality with a model footprint of 1.1MB while operating faster than real-time on an RPi 3B. Our…
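To see why a single logistic output distribution can be cheap to sample, note that its CDF has a closed-form inverse: one uniform draw and one logarithm give a sample. The sketch below is generic inverse-CDF sampling, not the paper's implementation; the parameter names `mu` (location) and `s` (scale) and both function names are assumptions, and the summary does not say how the network predicts these parameters.

```python
import math
import random


def sample_logistic(mu, s, rng=random):
    """Draw one sample from Logistic(mu, s) by inverting its CDF."""
    u = rng.random()
    # Clamp away from 0 and 1 so the logit below stays finite.
    u = min(max(u, 1e-7), 1.0 - 1e-7)
    return mu + s * math.log(u / (1.0 - u))


def logistic_cdf(x, mu, s):
    """CDF of Logistic(mu, s); sample_logistic inverts this function."""
    return 1.0 / (1.0 + math.exp(-(x - mu) / s))


# Usage with illustrative (hypothetical) parameters and a seeded generator.
rng = random.Random(0)
samples = [sample_logistic(0.5, 0.1, rng) for _ in range(20_000)]
sample_mean = sum(samples) / len(samples)
```

Because the logistic distribution is symmetric about `mu`, the sample mean should land close to `mu`, and `logistic_cdf(mu, mu, s)` is exactly 0.5.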
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing
