Finding Strength in Weakness: Learning to Separate Sounds with Weak Supervision

Fatemeh Pishdadian; Gordon Wichern; Jonathan Le Roux

arXiv:1911.02182·cs.SD·September 1, 2020

TL;DR

This paper introduces a weakly supervised learning approach for audio source separation that does not require isolated source signals during training, enabling separation in more general and less controlled environments.

Contribution

The paper proposes objective functions and network architectures that use weak labels, either clip-level or frame-level annotations, to train source separation models.

Findings

1. Achieves significant SI-SDR improvement with weak supervision
2. Performs well on urban sound mixtures with overlapping events
3. Enables training without isolated source data
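The first finding is stated in terms of SI-SDR improvement. As a rough illustration of how that metric is commonly computed, here is a minimal NumPy sketch of the standard scale-invariant SDR definition. It is not taken from the paper's evaluation code; the function names `si_sdr` and `si_sdr_improvement`, the `eps` stabilizer, and the assumption of zero-mean 1-D signals are ours.

```python
import numpy as np

def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant SDR in dB for 1-D signals (assumed zero-mean)."""
    estimate = np.asarray(estimate, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    # Scale the reference so it best matches the estimate (least squares).
    alpha = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = alpha * reference
    residual = estimate - target
    # Ratio of scaled-target energy to residual energy, in decibels.
    return 10.0 * np.log10((np.dot(target, target) + eps) /
                           (np.dot(residual, residual) + eps))

def si_sdr_improvement(estimate, reference, mixture):
    """Gain in SI-SDR of the estimate over using the raw mixture as the estimate."""
    return si_sdr(estimate, reference) - si_sdr(mixture, reference)
```

Under this definition, rescaling the estimate by a constant leaves the score essentially unchanged, which is why the improvement is measured relative to the unprocessed mixture rather than in absolute level.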

Abstract

While there has been much recent progress using deep learning techniques to separate speech and music audio signals, these systems typically require large collections of isolated sources during the training process. When extending audio source separation algorithms to more general domains such as environmental monitoring, it may not be possible to obtain isolated signals for training. Here, we propose objective functions and network architectures that enable training a source separation system with weak labels. In this scenario, weak labels are defined in contrast with strong time-frequency (TF) labels such as those obtained from isolated sources, and refer either to frame-level weak labels where one only has access to the time periods when different sources are active in an audio mixture, or to clip-level weak labels that only indicate the presence or absence of sounds in an entire…
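To make the two label granularities in the abstract concrete, the sketch below builds both kinds of weak label from a per-source activity matrix. It only illustrates what the labels contain, not the paper's objective functions or architectures; the function name `weak_labels_from_activity` and the toy activity matrix are hypothetical.

```python
import numpy as np

def weak_labels_from_activity(activity):
    """Derive weak labels from a (n_sources, n_frames) binary activity matrix.

    Hypothetical helper for illustration only.
    """
    activity = np.asarray(activity).astype(bool)
    # Frame-level weak labels: which sources are active in each time frame.
    frame_level = activity.astype(np.int64)
    # Clip-level weak labels: whether each source occurs anywhere in the clip.
    clip_level = activity.any(axis=1).astype(np.int64)
    return frame_level, clip_level

# Toy clip with three sound classes and five frames (made-up values).
activity = np.array([
    [0, 1, 1, 0, 0],  # class 0 active in frames 1-2
    [0, 0, 0, 0, 0],  # class 1 never active
    [1, 1, 0, 0, 1],  # class 2 active in frames 0, 1, and 4
])
frame_level, clip_level = weak_labels_from_activity(activity)
print(clip_level)  # [1 0 1]
```

Neither label says which time-frequency bins belong to which source, which is the information that strong TF labels from isolated sources would provide.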
