HiFi++: a Unified Framework for Bandwidth Extension and Speech Enhancement
Pavel Andreev, Aibek Alanov, Oleg Ivanov, Dmitry Vetrov

TL;DR
HiFi++ is a general framework that extends HiFi vocoders to bandwidth extension and speech enhancement. With an improved generator architecture, it matches or outperforms state-of-the-art methods on these tasks while using significantly less computation.
Contribution
The paper introduces HiFi++, a unified framework that adapts GAN-based HiFi vocoders to two further conditional audio generation tasks: bandwidth extension and speech enhancement.
Findings
HiFi++ performs better than or comparably to state-of-the-art methods in bandwidth extension and speech enhancement.
It requires significantly fewer computational resources than those methods.
These results are supported by a series of extensive experiments.
Abstract
Generative adversarial networks have recently demonstrated outstanding performance in neural vocoding, outperforming the best autoregressive and flow-based models. In this paper, we show that this success can be extended to other tasks of conditional audio generation. In particular, building upon HiFi vocoders, we propose a novel general framework, HiFi++, for bandwidth extension and speech enhancement. We show that with the improved generator architecture, HiFi++ performs better than or comparably to the state of the art in these tasks while using significantly fewer computational resources. The effectiveness of our approach is validated through a series of extensive experiments.
Taxonomy
Topics: Speech and Audio Processing · Advanced Data Compression Techniques · Speech Recognition and Synthesis
