HiFi++: a Unified Framework for Bandwidth Extension and Speech Enhancement

Pavel Andreev; Aibek Alanov; Oleg Ivanov; Dmitry Vetrov

arXiv:2203.13086 · cs.SD · December 12, 2023 · 1 cites

Open Access 3 Repos

TL;DR

HiFi++ is a general framework that extends HiFi vocoders to bandwidth extension and speech enhancement. With an improved generator architecture, it matches or exceeds state-of-the-art results while using significantly less computation.

Contribution

The paper introduces HiFi++, a unified framework that adapts HiFi vocoders to two conditional audio generation tasks, bandwidth extension and speech enhancement, using an improved generator architecture.

Findings

01 Outperforms or matches state-of-the-art methods in bandwidth extension and speech enhancement.

02 Uses significantly fewer computational resources than existing approaches.

03 The approach is validated through a series of extensive experiments.

Abstract

Generative adversarial networks have recently demonstrated outstanding performance in neural vocoding, outperforming the best autoregressive and flow-based models. In this paper, we show that this success can be extended to other tasks of conditional audio generation. In particular, building upon HiFi vocoders, we propose HiFi++, a novel general framework for bandwidth extension and speech enhancement. We show that with the improved generator architecture, HiFi++ performs better than or comparably to the state of the art in these tasks while using significantly fewer computational resources. The effectiveness of our approach is validated through a series of extensive experiments.

Taxonomy

Topics: Speech and Audio Processing · Advanced Data Compression Techniques · Speech Recognition and Synthesis