Audio Source Separation with Discriminative Scattering Networks

Pablo Sprechmann; Joan Bruna; Yann LeCun

arXiv:1412.7022 · cs.SD · April 29, 2015 · ICLR · 2 cites

TL;DR

This paper introduces a multi-resolution wavelet scattering representation for single-channel audio source separation, demonstrating improved results over fixed-resolution methods and exploring discriminative training with neural networks.

Contribution

It proposes a multi-resolution wavelet scattering representation for single-channel audio source separation and integrates it into discriminative training of neural networks.

Findings

01

Multi-resolution scattering improves source separation performance.

02

Discriminative training with neural networks enhances separation quality.

03

The representation generalizes Constant Q Transforms (CQT) with extra layers of convolution and complex modulus.

Abstract

In this report we describe an ongoing line of research for solving single-channel source separation problems. Many monaural signal decomposition techniques proposed in the literature operate on a feature space consisting of a time-frequency representation of the input data. A challenge faced by these approaches is to effectively exploit the temporal dependencies of the signals at scales larger than the duration of a time-frame. In this work we propose to tackle this problem by modeling the signals using a time-frequency representation with multiple temporal resolutions. The proposed representation consists of a pyramid of wavelet scattering operators, which generalizes Constant Q Transforms (CQT) with extra layers of convolution and complex modulus. We first show that learning standard models with this multi-resolution setting improves source separation results over fixed-resolution…
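The representation described in the abstract (a CQT-like filter bank followed by extra layers of convolution and complex modulus, computed at several temporal resolutions) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The Gaussian band-pass filters, the helper names (`constant_q_bank`, `lowpass`, `filter_fft`, `scattering`), and all parameter values are assumptions chosen for illustration, and the "pyramid" is approximated here simply by repeating the computation with low-pass filters of different widths.

```python
import numpy as np

def constant_q_bank(n, num_filters, q=4, xi_max=0.4):
    """Hypothetical bank of analytic Gaussian band-pass filters (frequency domain).

    Center frequencies are geometrically spaced and the bandwidth is
    proportional to the center frequency, i.e. a constant quality factor.
    """
    freqs = np.fft.fftfreq(n)                               # cycles per sample
    centers = xi_max * 2.0 ** (-np.arange(num_filters) / q)
    sigmas = centers / (2 * q)
    return np.exp(-0.5 * ((freqs[None, :] - centers[:, None]) / sigmas[:, None]) ** 2)

def lowpass(n, width):
    """Gaussian low-pass response; a smaller width averages over longer times."""
    return np.exp(-0.5 * (np.fft.fftfreq(n) / width) ** 2)

def filter_fft(u, h):
    """Circular convolution along the last axis with frequency response h."""
    return np.fft.ifft(np.fft.fft(u, axis=-1) * h, axis=-1)

def scattering(x, bank1, bank2, phi):
    """Two-layer scattering sketch: (convolution, complex modulus) twice, then averaging."""
    u1 = np.abs(filter_fft(x[None, :], bank1))                  # (J1, n) CQT-like envelopes
    s1 = filter_fft(u1, phi).real                               # first-order coefficients
    u2 = np.abs(filter_fft(u1[:, None, :], bank2[None, :, :]))  # (J1, J2, n) envelope modulations
    s2 = filter_fft(u2, phi).real                               # second-order coefficients
    return s1, s2

# Toy input: a pure tone at 0.4 cycles per sample.
n = 1000
x = np.cos(2 * np.pi * 0.4 * np.arange(n))
bank1 = constant_q_bank(n, num_filters=8)
bank2 = constant_q_bank(n, num_filters=4, q=2, xi_max=0.1)

# Assumed stand-in for the multi-resolution pyramid: one scattering per low-pass width.
pyramid = [scattering(x, bank1, bank2, lowpass(n, w)) for w in (0.02, 0.005)]
```

For simplicity the sketch computes every first-layer/second-layer filter pair and does not subsample; a practical scattering transform would typically prune pairs that carry little energy and downsample after averaging.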

Taxonomy

Topics: Speech and Audio Processing · Blind Source Separation Techniques · Image and Signal Denoising Methods

Methods: Convolution