Universal Score-based Speech Enhancement with High Content Preservation
Robin Scheibler, Yusuke Fujita, Yuma Shirahata, Tatsuya Komatsu

TL;DR
UNIVERSE++ is a universal speech enhancement method that combines score-based diffusion, adversarial training, and low-rank adaptation to improve speech quality and content preservation across diverse noisy conditions.
Contribution
The paper improves the existing UNIVERSE model in three ways: modifications to the network architecture, an adversarial loss, and a low-rank adaptation scheme with a phoneme fidelity loss.
Findings
Compares favorably to discriminative and generative baselines on multiple public benchmark datasets.
Achieves high content preservation and speech intelligibility.
Demonstrates robustness across various noise and distortion types.
Abstract
We propose UNIVERSE++, a universal speech enhancement method based on score-based diffusion and adversarial training. Specifically, we improve the existing UNIVERSE model that decouples clean speech feature extraction and diffusion. Our contributions are three-fold. First, we make several modifications to the network architecture, improving training stability and final performance. Second, we introduce an adversarial loss to promote learning high quality speech features. Third, we propose a low-rank adaptation scheme with a phoneme fidelity loss to improve content preservation in the enhanced speech. In the experiments, we train a universal enhancement model on a large scale dataset of speech degraded by noise, reverberation, and various distortions. The results on multiple public benchmark datasets demonstrate that UNIVERSE++ compares favorably to both discriminative and generative…
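As a rough illustration of the low-rank adaptation idea mentioned in the abstract, the sketch below shows the generic technique: a frozen weight matrix W is adjusted by a trainable low-rank product B @ A. This is not the paper's implementation. The function names (`matmul`, `init_adapter`, `adapted_weight`), the `scale` parameter, and the initialization are hypothetical choices made for illustration, and the phoneme fidelity loss that would drive the update is not shown.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def init_adapter(out_dim, in_dim, rank, seed=0):
    """Create adapter factors B (out_dim x rank) and A (rank x in_dim).

    B starts at zero so the adapted weight initially equals the frozen one
    (an assumed, commonly used initialization).
    """
    rng = random.Random(seed)
    a = [[rng.gauss(0.0, 0.01) for _ in range(in_dim)] for _ in range(rank)]
    b = [[0.0] * rank for _ in range(out_dim)]
    return b, a


def adapted_weight(w, b, a, scale=1.0):
    """Return W + scale * (B @ A). W is left untouched; only B and A would be trained."""
    delta = matmul(b, a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


if __name__ == "__main__":
    w = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]  # stand-in for a pretrained weight
    b, a = init_adapter(out_dim=2, in_dim=3, rank=1)
    print(adapted_weight(w, b, a) == w)  # zero-initialized B leaves W unchanged
```

With rank r, the adapter holds r * (in_dim + out_dim) trainable values instead of in_dim * out_dim, which is why fine-tuning on an auxiliary objective can be done without touching the bulk of the pretrained parameters.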
Taxonomy
Topics: Speech and Audio Processing · Advanced Data Compression Techniques · Advanced Adaptive Filtering Techniques
Methods: Diffusion
