Ring Mixing with Auxiliary Signal-to-Consistency-Error Ratio Loss for Unsupervised Denoising in Speech Separation

Matthew Maciejewski; Samuele Cornell

arXiv:2604.08415·eess.AS·April 10, 2026

TL;DR

This paper introduces ring mixing and an auxiliary Signal-to-Consistency-Error Ratio (SCER) loss that let a speech separation model learn to denoise from noisy recordings alone, without clean references.

Contribution

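The paper proposes a batch strategy (ring mixing, which uses each source in two mixtures) and an auxiliary SCER loss that penalizes inconsistent estimates of the same source from different mixtures. Together they break the symmetry of the training loss and incentivize denoising.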

Findings

01 Can reduce residual noise by upwards of half on a WHAM!-based benchmark.

02 Learns to denoise from only noisy recordings, with no clean references.

03 Opens the door to training more generalizable systems on in-domain, and thus often noisy, speech.

Abstract

Noisy speech separation systems are typically trained on fully-synthetic mixtures, limiting generalization to real-world scenarios. Though training on mixtures of in-domain (thus often noisy) speech is possible, we show that this leads to undesirable optima where mixture noise is retained in the estimates, due to the inseparability of the background noises and the loss function's symmetry. To address this, we propose ring mixing, a batch strategy of using each source in two mixtures, alongside a new Signal-to-Consistency-Error Ratio (SCER) auxiliary loss penalizing inconsistent estimates of the same source from different mixtures, breaking symmetry and incentivizing denoising. On a WHAM!-based benchmark, our method can reduce residual noise by upwards of half, effectively learning to denoise from only noisy recordings. This opens the door to training more generalizable systems using…
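
As a rough illustration of the two ideas in the abstract, the sketch below builds a ring of mixtures and scores the agreement between the two estimates of each source. It is not the authors' implementation. The abstract does not give the SCER formula, so the form used here (energy of one estimate over the energy of the difference between the two estimates, in dB) is an assumption, as are the function names `ring_mix`, `scer_db`, and `ring_scer_loss`, the choice of adjacent pairs in the batch, and the premise that the separator's two output slots are already aligned to the first and second source of each mixture.

```python
import numpy as np


def ring_mix(sources):
    """Build a ring of two-source mixtures from a batch of recordings.

    sources: array of shape (B, T), one (possibly noisy) recording per row.
    Returns an array of shape (B, T) where mixture i is
    sources[i] + sources[(i + 1) % B], so every source is used in exactly
    two mixtures. Pairing adjacent rows is an assumption of this sketch.
    """
    sources = np.asarray(sources, dtype=np.float64)
    # np.roll with shift -1 places sources[(i + 1) % B] at row i.
    return sources + np.roll(sources, -1, axis=0)


def scer_db(est_a, est_b, eps=1e-8):
    """Assumed SCER form: 10 * log10(||est_a||^2 / ||est_a - est_b||^2).

    est_a and est_b are two estimates of the same source obtained from
    different mixtures. Higher values mean more consistent estimates.
    """
    est_a = np.asarray(est_a, dtype=np.float64)
    est_b = np.asarray(est_b, dtype=np.float64)
    signal = np.sum(est_a ** 2, axis=-1)
    error = np.sum((est_a - est_b) ** 2, axis=-1)
    return 10.0 * np.log10((signal + eps) / (error + eps))


def ring_scer_loss(estimates):
    """Negative mean SCER over a ring-mixed batch.

    estimates: array of shape (B, 2, T). Assumption: estimates[i, 0]
    targets sources[i] and estimates[i, 1] targets sources[(i + 1) % B],
    i.e. any permutation ambiguity has already been resolved.
    """
    estimates = np.asarray(estimates, dtype=np.float64)
    from_own_mixture = estimates[:, 0]
    # Source i is also the second source of mixture i - 1.
    from_previous_mixture = np.roll(estimates[:, 1], 1, axis=0)
    return -np.mean(scer_db(from_own_mixture, from_previous_mixture))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    sources = rng.standard_normal((4, 16000))
    mixtures = ring_mix(sources)
    # Stand-in for a separator: estimates corrupted by independent errors.
    targets = np.stack([sources, np.roll(sources, -1, axis=0)], axis=1)
    estimates = targets + 0.1 * rng.standard_normal(targets.shape)
    print("mixtures:", mixtures.shape, "loss:", ring_scer_loss(estimates))
```

In a training setup this term would be added to the main separation loss with some weight; the weight and the way the two losses are combined are not stated in the abstract and are left out of the sketch.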
