Latent Iterative Refinement for Modular Source Separation
Dimitrios Bralios, Efthymios Tzinis, Gordon Wichern, Paris Smaragdis, Jonathan Le Roux

TL;DR
This paper introduces a novel iterative refinement approach for source separation that enhances resource efficiency during training and inference by reusing processing blocks and dynamically adjusting iterations.
Contribution
It proposes reformulating source separation models as iterative latent mappings, enabling efficient training and inference through block reuse and dynamic iteration control.
Findings
Reduces memory requirements during training.
Improves resource efficiency with iterative processing.
Enables dynamic inference adjustments.
Abstract
Traditional source separation approaches train deep neural network models end-to-end with all the data available at once by minimizing the empirical risk on the whole training set. On the inference side, after training the model, the user fetches a static computation graph and runs the full model on some specified observed mixture signal to get the estimated source signals. Additionally, many of those models consist of several basic processing blocks which are applied sequentially. We argue that we can significantly increase resource efficiency during both training and inference stages by reformulating a model's training and inference procedures as iterative mappings of latent signal representations. First, we can apply the same processing block more than once on its output to refine the input signal and consequently improve parameter efficiency. During training, we can follow a…
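The block reuse described in the abstract, together with the dynamic iteration control mentioned in the summary, can be illustrated with a small sketch. This is a toy under stated assumptions, not the paper's implementation: the `tanh` block, the weights, the function names `make_block` and `refine`, and the stopping rule (stop once the latent changes by less than `tol`) are all hypothetical choices made for illustration.

```python
import math


def make_block(weights, bias):
    """Return a toy processing block mapping a latent z to tanh(W z + b).

    The block form is an assumption for illustration only.
    """
    def block(z):
        return [
            math.tanh(sum(w * x for w, x in zip(row, z)) + b)
            for row, b in zip(weights, bias)
        ]
    return block


def refine(block, z0, max_iters=16, tol=1e-4):
    """Apply the same block repeatedly to its own output.

    Stops early when the largest absolute change in the latent falls
    below `tol`. This stopping rule is an assumed heuristic, not the
    criterion used in the paper.
    Returns the refined latent and the number of block applications.
    """
    z = list(z0)
    for step in range(1, max_iters + 1):
        z_next = block(z)
        delta = max(abs(a - b) for a, b in zip(z_next, z))
        z = z_next
        if delta < tol:
            return z, step
    return z, max_iters


# Hypothetical 2-dimensional latent; small weights keep the map contractive.
block = make_block(weights=[[0.3, -0.1], [0.2, 0.4]], bias=[0.05, -0.02])

# Let the refinement run until the latent settles.
z_final, steps_used = refine(block, z0=[1.0, -1.0])

# Cap the compute budget at inference time with the same parameters.
z_budget, budget_steps = refine(block, z0=[1.0, -1.0], max_iters=3)
```

Because one set of block parameters serves every iteration, the parameter count does not grow with the number of refinement steps, and the iteration budget can be chosen per input at inference time.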
Taxonomy
Topics: Speech and Audio Processing · Speech Recognition and Synthesis · Music and Audio Processing
