CIS-BWE: Chaos-Informed Speech Bandwidth Extension
Tarikul Islam Tamiti, Tonmoy Das, Nursadul Mamun, Anomadarshi Barua

TL;DR
This paper introduces NDSI-BWE, a novel chaos-informed adversarial framework for speech bandwidth extension that leverages multiple discriminators inspired by nonlinear dynamical systems to improve high-frequency component recovery.
Contribution
It presents a new framework with seven chaos-inspired discriminators and a complex-valued ConformerNeXt generator, achieving state-of-the-art results in speech bandwidth extension.
Findings
Achieves new state-of-the-art performance in BWE across six metrics.
Cuts the parameter count eightfold using depth-wise convolution.
Outperforms previous methods in subjective human evaluations.
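To give a rough sense of where savings from depth-wise convolution come from, the sketch below compares the weight count of a standard 1-D convolution with a depth-wise convolution followed by a 1x1 point-wise convolution. The function names and the layer sizes (64 channels, kernel size 9) are illustrative assumptions, not values taken from the paper, and the paper's eightfold figure may be measured differently.

```python
def conv1d_params(c_in, c_out, kernel_size):
    """Weight count of a standard 1-D convolution (bias omitted)."""
    return c_in * c_out * kernel_size


def depthwise_separable_params(c_in, c_out, kernel_size):
    """Weight count of a depth-wise conv (one filter per input channel)
    followed by a 1x1 point-wise conv that mixes channels (bias omitted)."""
    depthwise = c_in * kernel_size
    pointwise = c_in * c_out
    return depthwise + pointwise


# Hypothetical layer sizes, chosen only for illustration.
c_in, c_out, k = 64, 64, 9
standard = conv1d_params(c_in, c_out, k)
separable = depthwise_separable_params(c_in, c_out, k)
ratio = standard / separable
print(f"standard={standard}, separable={separable}, ratio={ratio:.2f}")
```

For wide layers the ratio approaches the kernel size, so larger kernels yield larger savings under this factorization.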
Abstract
Recovering high-frequency components lost to bandwidth constraints is crucial for applications ranging from telecommunications to high-fidelity audio on limited resources. We introduce NDSI-BWE, a new adversarial bandwidth extension (BWE) framework that leverages four new discriminators inspired by nonlinear dynamical systems to capture diverse temporal behaviors: a Multi-Resolution Lyapunov Discriminator (MRLD) for determining sensitivity to initial conditions by capturing deterministic chaos, a Multi-Scale Recurrence Discriminator (MS-RD) for self-similar recurrence dynamics, a Multi-Scale Detrended Fractal Analysis Discriminator (MSDFA) for long-range, slowly varying, scale-invariant relationships, a Multi-Resolution Poincaré Plot Discriminator (MR-PPD) for capturing hidden latent-space relationships, a Multi-Period Discriminator (MPD) for cyclical patterns, a Multi-Resolution Amplitude…
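As background for the Poincaré plot mentioned in the abstract, here is a minimal sketch of the underlying representation: a lagged Poincaré plot pairs each sample x[n] with x[n + lag]. The function name `poincare_pairs` and the toy sine signal are assumptions for illustration; how the MR-PPD actually consumes such plots is not described in this excerpt.

```python
import math


def poincare_pairs(signal, lag=1):
    """Return the points (x[n], x[n + lag]) of a lagged Poincaré plot."""
    if lag < 1 or lag >= len(signal):
        raise ValueError("lag must satisfy 1 <= lag < len(signal)")
    return list(zip(signal[:-lag], signal[lag:]))


# Toy signal (hypothetical): a sine wave with a period of 8 samples.
toy = [math.sin(2 * math.pi * n / 8) for n in range(16)]

pairs_lag1 = poincare_pairs(toy, lag=1)
# With a lag of half a period, x[n + lag] == -x[n], so every point
# falls on the anti-diagonal of the plot.
pairs_lag4 = poincare_pairs(toy, lag=4)
```

Sweeping the lag changes the shape of the point cloud, which is one plausible way to view a signal at several time scales, though the excerpt does not say that this is what "multi-resolution" means here.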
Taxonomy
Topics: Advanced Data Compression Techniques · Speech and Audio Processing
