DSP-informed bandwidth extension using locally-conditioned excitation and linear time-varying filter subnetworks
Shahan Nercessian, Alexey Lukin, and Johannes Imort

TL;DR
This paper introduces a dual-stage, DSP-informed approach for bandwidth extension of speech signals from 8 kHz to 48 kHz, explicitly modeling excitation and filtering stages to improve quality over existing deep learning methods.
Contribution
It presents a dual-stage architecture for speech bandwidth extension with explicit excitation and linear time-varying filter stages, adding an inductive bias that existing end-to-end deep learning models lack.
Findings
The proposed architecture improves BWE results when the generator from either SEANet or HiFi-GAN is used as the exciter.
Explicit modeling of excitation and filtering stages enhances spectral shaping.
Adaptive processing with acoustic feature predictions outperforms previous methods.
Abstract
In this paper, we propose a dual-stage architecture for bandwidth extension (BWE) that increases the effective sampling rate of speech signals from 8 kHz to 48 kHz. Unlike existing end-to-end deep learning models, our proposed method explicitly models BWE using excitation and linear time-varying (LTV) filter stages. The excitation stage broadens the spectrum of the input, while the filtering stage properly shapes it based on outputs from an acoustic feature predictor. To this end, an acoustic feature loss term can implicitly promote the excitation subnetwork to produce white spectra in the upper frequency band to be synthesized. Experimental results demonstrate that the added inductive bias provided by our approach can improve upon BWE results using the generators from either SEANet or HiFi-GAN as exciters, and that our means of adapting processing with acoustic feature predictions is more…
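As a rough illustration of the filtering stage described in the abstract, the sketch below shapes an excitation signal with a different magnitude response in every frame, using windowed FFT overlap-add. It is not the authors' implementation: the function name ltv_filter, the frame and hop sizes, the hand-made gain curve, and the use of white noise in place of a learned exciter are all assumptions made for this example. In the proposed method the per-frame shaping would instead be driven by the acoustic feature predictor.

```python
import numpy as np


def ltv_filter(excitation, frame_gains, frame_len=512, hop=256):
    """Shape `excitation` with one magnitude response per frame (overlap-add).

    Hypothetical helper for illustration only.
    frame_gains has shape (n_frames, frame_len // 2 + 1); row i holds the
    real, non-negative gains applied to the rfft bins of frame i.
    """
    if frame_len != 2 * hop:
        raise ValueError("this sketch assumes 50% overlap (frame_len == 2 * hop)")
    n_frames, n_bins = frame_gains.shape
    if n_bins != frame_len // 2 + 1:
        raise ValueError("frame_gains must have frame_len // 2 + 1 columns")

    # Periodic Hann window: shifted copies at 50% overlap sum to exactly 1.
    window = np.hanning(frame_len + 1)[:-1]

    out_len = hop * (n_frames - 1) + frame_len
    padded = np.zeros(out_len)
    n = min(len(excitation), out_len)
    padded[:n] = excitation[:n]

    out = np.zeros(out_len)
    for i in range(n_frames):
        start = i * hop
        frame = padded[start:start + frame_len] * window
        spectrum = np.fft.rfft(frame)
        # Zero-phase shaping: the filter changes from frame to frame.
        # Bin-wise multiplication is circular convolution, a simplification
        # that a more careful implementation would avoid by zero-padding.
        shaped = np.fft.irfft(spectrum * frame_gains[i], n=frame_len)
        out[start:start + frame_len] += shaped
    return out[:n]


# Stand-in excitation: white noise at 48 kHz (a learned exciter is assumed
# to produce a similarly flat upper band).
rng = np.random.default_rng(0)
n_frames = 10
noise = rng.standard_normal(256 * (n_frames - 1) + 512)

# Hand-made time-varying envelope: the spectral tilt steepens over time.
bins = np.linspace(0.0, 1.0, 257)
tilts = np.linspace(1.0, 6.0, n_frames)
gains = np.exp(-tilts[:, None] * bins[None, :])

shaped_noise = ltv_filter(noise, gains)
```

With all gains set to 1 the function returns the input unchanged except for the first and last hop samples, which are covered by only one window. That property makes it easy to check that the overlap-add itself adds no coloration, so any spectral shaping comes from the per-frame gains alone.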
Taxonomy
Topics: PAPR reduction in OFDM · Optical Network Technologies · Advanced Photonic Communication Systems
Methods: HiFi-GAN
