# End-to-end Source Separation with Adaptive Front-Ends

**Authors:** Shrikant Venkataramani, Jonah Casebeer, Paris Smaragdis

arXiv: 1705.02514 · 2017-11-01

## TL;DR

This paper introduces an auto-encoder neural network that learns optimal, real-valued basis functions directly from raw audio waveforms, enabling end-to-end source separation that outperforms traditional Fourier-based methods.

## Contribution

The paper presents a neural-network adaptive front-end that replaces the short-time Fourier transform in supervised source separation, and proposes a source-to-distortion ratio (SDR) based cost function for training the system end to end.

## Key findings

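- The learned transforms significantly outperform their Fourier counterparts in separation performance.
- The auto-encoder learns real-valued basis functions directly from the raw waveform and can act as an equivalent to short-time front-end transforms.
- A source-to-distortion ratio (SDR) based cost function is proposed for end-to-end source separation.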
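
To make the idea of an SDR-based cost concrete, here is a minimal NumPy sketch. It assumes a common projection-based definition of SDR (the estimate is split into a component along the reference and a residual), which is not necessarily the exact formulation used in the paper. The function names `sdr_db` and `neg_sdr_loss` are hypothetical.

```python
import numpy as np

def sdr_db(reference, estimate, eps=1e-12):
    """Source-to-distortion ratio in dB (assumed projection-based definition)."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    # Component of the estimate that lies along the reference signal.
    scale = np.dot(estimate, reference) / (np.dot(reference, reference) + eps)
    target = scale * reference
    # Everything else in the estimate is counted as distortion.
    distortion = estimate - target
    ratio = (np.dot(target, target) + eps) / (np.dot(distortion, distortion) + eps)
    return 10.0 * np.log10(ratio)

def neg_sdr_loss(reference, estimate):
    """Loss to minimize: a higher SDR gives a lower loss."""
    return -sdr_db(reference, estimate)

# Toy example: a sinusoidal "source" plus an orthogonal interferer.
t = np.arange(1000) / 1000.0
reference = np.sin(2 * np.pi * 5 * t)
interferer = np.cos(2 * np.pi * 50 * t)
good_estimate = reference + 0.01 * interferer
bad_estimate = reference + 0.5 * interferer
```

Under this definition the loss is invariant to rescaling the estimate, so it rewards getting the waveform shape right rather than matching the amplitude.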

## Abstract

Source separation and other audio applications have traditionally relied on the use of short-time Fourier transforms as a front-end frequency domain representation step. The unavailability of a neural network equivalent to forward and inverse transforms hinders the implementation of end-to-end learning systems for these applications. We present an auto-encoder neural network that can act as an equivalent to short-time front-end transforms. We demonstrate the ability of the network to learn optimal, real-valued basis functions directly from the raw waveform of a signal and further show how it can be used as an adaptive front-end for supervised source separation. In terms of separation performance, these transforms significantly outperform their Fourier counterparts. Finally, we also propose a novel source to distortion ratio based cost function for end-to-end source separation.
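
As a rough illustration of how a network layer can stand in for a short-time transform, the sketch below writes the analysis step as a strided 1-D convolution (framing plus projection onto real-valued basis rows) and the synthesis step as a transposed convolution (back-projection plus overlap-add). This is a simplified sketch, not the architecture from the paper: the basis here is a fixed random orthonormal matrix standing in for weights that would be learned by gradient descent, and the names `analysis`, `synthesis`, `basis`, and `hop` are hypothetical.

```python
import numpy as np

def analysis(signal, basis, hop):
    """Strided 1-D convolution: slice frames and project each onto the basis rows."""
    n_basis, frame_len = basis.shape
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] for i in range(n_frames)]
    )
    return frames @ basis.T  # shape: (n_frames, n_basis)

def synthesis(coeffs, inv_basis, hop):
    """Transposed convolution: map coefficients back to frames and overlap-add."""
    n_basis, frame_len = inv_basis.shape
    n_frames = coeffs.shape[0]
    frames = coeffs @ inv_basis  # shape: (n_frames, frame_len)
    out = np.zeros((n_frames - 1) * hop + frame_len)
    for i in range(n_frames):
        out[i * hop : i * hop + frame_len] += frames[i]
    return out

rng = np.random.default_rng(0)
frame_len = 64
# Stand-in for learned weights: a random real-valued orthonormal basis.
basis, _ = np.linalg.qr(rng.standard_normal((frame_len, frame_len)))
signal = rng.standard_normal(frame_len * 10)

# With non-overlapping frames and an orthonormal basis, the same matrix
# serves as the inverse, so the round trip reproduces the input.
coeffs = analysis(signal, basis, hop=frame_len)
reconstructed = synthesis(coeffs, basis, hop=frame_len)
```

In an end-to-end system the separation model would operate on `coeffs` between the two steps, and the analysis and synthesis weights would be trained jointly with it instead of being fixed in advance.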

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1705.02514/full.md

## Figures

4 figures with captions in the complete paper: https://tomesphere.com/paper/1705.02514/full.md

## References

17 references — full list in the complete paper: https://tomesphere.com/paper/1705.02514/full.md

---
Source: https://tomesphere.com/paper/1705.02514