# Deep Tensor Factorization for Spatially-Aware Scene Decomposition

**Authors:** Jonah Casebeer, Michael Colomb, Paris Smaragdis

arXiv: 1905.01391 · 2019-09-30

## TL;DR

This paper introduces an unsupervised neural network method that decomposes a multi-channel audio recording into its constituent sources and each source's relative presence at every microphone, without labeled data.

## Contribution

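The paper formulates a neural network architecture that can be interpreted as a nonnegative tensor factorization of a multi-channel audio recording. This allows end-to-end training with stochastic minibatches, which makes it feasible to decompose realistic audio scenes that are intractable for standard methods.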
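For orientation, one common way to write a nonnegative factorization of a three-way tensor is the CP/PARAFAC-style model below. This is a sketch under assumptions: the summary does not give the paper's exact formulation, and the symbols $X$ (channel by frequency by time magnitudes), $Q$ (per-channel gains), $W$ (spectral dictionary), $H$ (activations over time), and $K$ (number of components) are illustrative names, not notation taken from the paper.

```latex
X_{c,f,t} \;\approx\; \sum_{k=1}^{K} Q_{c,k}\, W_{f,k}\, H_{k,t},
\qquad Q_{c,k} \ge 0,\quad W_{f,k} \ge 0,\quad H_{k,t} \ge 0
```

Under this reading, column $k$ of $Q$ describes how strongly component $k$ appears in each microphone, which is the kind of channel-content parameter that could be clustered to group components into sources.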

## Key findings

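- Sources in multi-microphone recordings are recovered by clustering the learned parameters that correspond to channel content, which yields each source's spectral dictionary and its activation pattern over time
- Stochastic minibatch training scales the decomposition to realistic audio scenes that are intractable for standard methods
- The architecture extends easily to other kinds of tensor factorization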

## Abstract

We propose a completely unsupervised method to understand audio scenes observed with random microphone arrangements by decomposing the scene into its constituent sources and their relative presence in each microphone. To this end, we formulate a neural network architecture that can be interpreted as a nonnegative tensor factorization of a multi-channel audio recording. By clustering on the learned network parameters corresponding to channel content, we can learn sources' individual spectral dictionaries and their activation patterns over time. Our method allows us to leverage deep learning advances like end-to-end training, while also allowing stochastic minibatch training so that we can feasibly decompose realistic audio scenes that are intractable to decompose using standard methods. This neural network architecture is easily extensible to other kinds of tensor factorizations.
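The idea of factorizing a multi-channel recording and then clustering the channel-related parameters can be illustrated with a minimal NumPy sketch. It is not the authors' network: it swaps in minibatch multiplicative updates for a nonnegative CP-style model in place of a neural network trained by backpropagation, and it runs on synthetic data. All names (`nonneg_cp_minibatch`, `cluster_by_channel`, `Q`, `W`, `H`) and all sizes are hypothetical choices made for this example.

```python
import numpy as np

def reconstruct(Q, W, H):
    # X_hat[c, f, t] = sum_k Q[c, k] * W[f, k] * H[k, t]
    return np.einsum("ck,fk,kt->cft", Q, W, H)

def rel_error(X, Q, W, H):
    return np.linalg.norm(X - reconstruct(Q, W, H)) / np.linalg.norm(X)

def nonneg_cp_minibatch(X, rank, epochs=30, batch=16, seed=0, eps=1e-9):
    """Fit a nonnegative CP model using minibatches of time frames.

    Assumption: multiplicative updates stand in for gradient-based training.
    """
    rng = np.random.default_rng(seed)
    C, F, T = X.shape
    Q = rng.random((C, rank)) + 0.1  # channel gains per component
    W = rng.random((F, rank)) + 0.1  # spectral dictionary
    H = rng.random((rank, T)) + 0.1  # activations over time
    n_batches = max(1, T // batch)
    for _ in range(epochs):
        for idx in np.array_split(rng.permutation(T), n_batches):
            Xb, Hb = X[:, :, idx], H[:, idx]
            # Each factor is scaled by (data correlation) / (model correlation),
            # which keeps every entry nonnegative.
            Hb = Hb * np.einsum("cft,ck,fk->kt", Xb, Q, W) / (
                ((Q.T @ Q) * (W.T @ W)) @ Hb + eps)
            W = W * np.einsum("cft,ck,kt->fk", Xb, Q, Hb) / (
                W @ ((Q.T @ Q) * (Hb @ Hb.T)) + eps)
            Q = Q * np.einsum("cft,fk,kt->ck", Xb, W, Hb) / (
                Q @ ((W.T @ W) * (Hb @ Hb.T)) + eps)
            H[:, idx] = Hb
    return Q, W, H

def cluster_by_channel(Q, n_sources, iters=20, seed=0):
    """Tiny k-means over unit-normalised columns of Q (one per component)."""
    rng = np.random.default_rng(seed)
    P = (Q / (np.linalg.norm(Q, axis=0, keepdims=True) + 1e-12)).T
    centers = P[rng.choice(len(P), n_sources, replace=False)]
    for _ in range(iters):
        dist = ((P[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = dist.argmin(1)
        for s in range(n_sources):
            if np.any(labels == s):
                centers[s] = P[labels == s].mean(0)
    return labels

# Synthetic scene: 4 microphones, 2 sources, 2 components per source.
data_rng = np.random.default_rng(1)
C, F, T, K = 4, 20, 64, 4
gains = np.array([[1.0, 0.1], [0.8, 0.2], [0.2, 0.8], [0.1, 1.0]])
Q_true = np.repeat(gains, 2, axis=1)  # components of a source share its gains
X = reconstruct(Q_true, data_rng.random((F, K)), data_rng.random((K, T)))

Q0, W0, H0 = nonneg_cp_minibatch(X, rank=K, epochs=0)  # untrained baseline
Q, W, H = nonneg_cp_minibatch(X, rank=K, epochs=30)
err_before = rel_error(X, Q0, W0, H0)
err_after = rel_error(X, Q, W, H)
labels = cluster_by_channel(Q, n_sources=2)
print("relative error before/after:", err_before, err_after)
print("component-to-source labels:", labels)
```

Components that receive the same label can then be summed, using their columns of `W` and rows of `H`, to form one source estimate per cluster. Because each update touches only a batch of time frames, the full tensor never has to be processed in a single step, which is the property the abstract attributes to stochastic minibatch training.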

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1905.01391/full.md

## Figures

8 figures with captions in the complete paper: https://tomesphere.com/paper/1905.01391/full.md

## References

17 references — full list in the complete paper: https://tomesphere.com/paper/1905.01391/full.md

---
Source: https://tomesphere.com/paper/1905.01391