Differentiable Dictionary Search: Integrating Linear Mixing with Deep Non-Linear Modelling for Audio Source Separation
Lukáš Samuel Marták, Rainer Kelz, Gerhard Widmer

TL;DR
This paper improves Differentiable Dictionary Search (DDS), a method that uses normalizing flows to model the dictionary of a linear decomposition such as NMF, thereby combining linear mixing with deep non-linear modelling. Applied to audio source separation for piano transcription, it yields sparser and more precise decompositions than a linear NMF baseline.
Contribution
The paper presents several steps towards making DDS scalable. DDS models the dictionary with deep invertible density estimators (normalizing flows), and the revised method improves decomposition quality over a traditional linear method.
Findings
Enhanced sparsity and precision in source decomposition
Better performance compared to linear NMF baseline
Scalability improvements for practical application
Abstract
This paper describes several improvements to a new method for signal decomposition that we recently formulated under the name of Differentiable Dictionary Search (DDS). The fundamental idea of DDS is to exploit a class of powerful deep invertible density estimators called normalizing flows, to model the dictionary in a linear decomposition method such as NMF, effectively creating a bijection between the space of dictionary elements and the associated probability space, allowing a differentiable search through the dictionary space, guided by the estimated densities. As the initial formulation was a proof of concept with some practical limitations, we will present several steps towards making it scalable, hoping to improve both the computational complexity of the method and its signal decomposition capabilities. As a testbed for experimental evaluation, we choose the task of frame-level…
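For context, the linear decomposition that DDS builds on can be sketched as follows. This is a minimal illustration of plain NMF activation estimation with a fixed dictionary, using the standard multiplicative update for the Euclidean cost. It is not the authors' implementation and contains no normalizing flow; the function name `nmf_activations`, the toy dimensions, and the random data are assumptions made only for this example.

```python
import numpy as np

def nmf_activations(V, W, n_iter=200, eps=1e-9, seed=0):
    """Estimate non-negative activations H with V ≈ W @ H, keeping W fixed.

    V: (n_bins, n_frames) non-negative data, e.g. a magnitude spectrogram.
    W: (n_bins, n_atoms) non-negative dictionary (hypothetical, given here).
    """
    rng = np.random.default_rng(seed)
    # Strictly positive start, so multiplicative updates can move every entry.
    H = rng.random((W.shape[1], V.shape[1])) + eps
    for _ in range(n_iter):
        # Multiplicative update for the squared Euclidean cost ||V - W H||^2;
        # it keeps H non-negative and does not increase the cost.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
    return H

# Toy data (assumed for illustration): 16 bins, 4 atoms, 10 frames.
rng = np.random.default_rng(1)
W = rng.random((16, 4))
H_true = rng.random((4, 10)) * (rng.random((4, 10)) > 0.5)  # sparse activations
V = W @ H_true

H_est = nmf_activations(V, W)
error = np.linalg.norm(V - W @ H_est)
error_one_step = np.linalg.norm(V - W @ nmf_activations(V, W, n_iter=1))
```

In this baseline the dictionary `W` is a fixed matrix. As the abstract describes it, DDS instead models the dictionary with a normalizing flow, so that the dictionary elements themselves can be searched by gradient descent under the guidance of the estimated densities.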
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Music Technology and Sound Studies
