DisSR: Disentangling Speech Representation for Degradation-Prior Guided Cross-Domain Speech Restoration

Ziqi Liang; Zhijun Jia; Chang Liu; Minghui Yang; Zhihong Lu; Jian Wang

arXiv:2602.12701·cs.SD·February 16, 2026

Open Access

TL;DR

DisSR introduces a general speech restoration model that leverages degradation priors and domain adaptation to effectively restore speech across various distortions and unseen domains, outperforming single-task models.

Contribution

The paper proposes DisSR, a novel disentangling-based speech restoration framework that incorporates degradation-prior guidance and cross-domain training for improved generalization.

Findings

01 Achieves high-quality speech restoration across multiple distortion types.

02 Demonstrates superior generalization to unseen domains.

03 Outperforms existing single-task speech restoration models.

Abstract

Previous work on speech restoration (SR) focuses primarily on single-task speech restoration (SSR), which cannot address the general speech restoration problem. Training a separate SSR model for each distortion type is time-consuming and lacks generality. In addition, most studies ignore the problem of model generalization to unseen domains. To overcome these limitations, we propose DisSR, a general speech restoration model based on Disentangling Speech Representation, with two properties: 1) Degradation-prior guidance, which extracts a speaker-invariant degradation representation to guide the diffusion-based speech restoration model. 2) Domain adaptation, where we design cross-domain alignment training to enhance the model's adaptability and generalization on cross-domain data. Experimental results demonstrate that our method can produce high-quality restored speech under various…

Taxonomy

Topics: Speech and Audio Processing · Hearing Loss and Rehabilitation · Speech Recognition and Synthesis