Latent Autoregressive Source Separation
Emilian Postolache, Giorgio Mariani, Michele Mancusi, Andrea Santilli, Luca Cosmo, Emanuele Rodolà

TL;DR
This paper introduces LASS, a latent autoregressive source separation method that de-mixes signals efficiently and scalably without additional training, by combining a Bayesian formulation with frequency-count likelihoods.
Contribution
LASS provides a new source separation approach that needs no gradient-based optimization: it uses pre-trained autoregressive models as priors within a Bayesian framework with discrete likelihoods.
Findings
Competitive separation quality on images and audio.
Significant inference speedups over existing methods.
Scalability to higher-dimensional data.
Abstract
Autoregressive models have achieved impressive results over a wide range of domains in terms of generation quality and downstream task performance. In the continuous domain, a key factor behind this success is the usage of quantized latent spaces (e.g., obtained via VQ-VAE autoencoders), which allow for dimensionality reduction and faster inference times. However, using existing pre-trained models to perform new non-trivial tasks is difficult since it requires additional fine-tuning or extensive training to elicit prompting. This paper introduces LASS as a way to perform vector-quantized Latent Autoregressive Source Separation (i.e., de-mixing an input signal into its constituent sources) without requiring additional gradient-based optimization or modifications of existing models. Our separation method relies on the Bayesian formulation in which the autoregressive models are the priors,…
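The Bayesian formulation mentioned above can be illustrated with a minimal toy sketch. This is not the authors' implementation: the function names (`count_likelihood`, `separate`), the stand-in priors, and the toy mixing rule `m = (z1 + z2) % K` are all assumptions made for illustration. The sketch assumes each source and the mixture are already encoded as discrete latent tokens, estimates a likelihood table p(m | z1, z2) from co-occurrence counts, and then picks source tokens step by step from a posterior proportional to prior1 × prior2 × likelihood.

```python
import numpy as np

def count_likelihood(z1, z2, m, K, eps=1e-6):
    """Estimate p(m | z1, z2) by counting co-occurring token triples.

    z1, z2, m are integer arrays of equal length with values in [0, K).
    eps is a small smoothing constant (an assumption of this sketch).
    """
    counts = np.zeros((K, K, K))
    np.add.at(counts, (z1, z2, m), 1.0)   # accumulate frequency counts
    counts += eps                          # avoid zero rows before normalizing
    return counts / counts.sum(axis=2, keepdims=True)

def separate(mix_tokens, prior1, prior2, lik, rng=None):
    """Choose source tokens one step at a time from the discrete posterior.

    prior1 / prior2 are hypothetical callables mapping a token history to a
    length-K probability vector (stand-ins for pre-trained autoregressive
    models). With rng=None the most probable pair is taken; otherwise a
    pair is sampled from the posterior.
    """
    s1, s2 = [], []
    for m_t in mix_tokens:
        p1 = prior1(s1)
        p2 = prior2(s2)
        # Posterior over all (z1, z2) pairs given the observed mixture token.
        post = p1[:, None] * p2[None, :] * lik[:, :, m_t]
        post = post / post.sum()
        if rng is None:
            idx = np.argmax(post)
        else:
            idx = rng.choice(post.size, p=post.ravel())
        i, j = np.unravel_index(idx, post.shape)
        s1.append(int(i))
        s2.append(int(j))
    return s1, s2

# Toy setup (assumed): K = 4 tokens, mixture token is (z1 + z2) mod K.
K = 4
z1 = np.repeat(np.arange(K), K)
z2 = np.tile(np.arange(K), K)
m = (z1 + z2) % K
lik = count_likelihood(z1, z2, m, K)

# Stand-in priors that ignore the history: source 1 strongly prefers
# token 0, source 2 is uniform.
prior1 = lambda history: np.array([0.97, 0.01, 0.01, 0.01])
prior2 = lambda history: np.full(K, 1.0 / K)

s1, s2 = separate([2, 3, 1], prior1, prior2, lik)
```

In this sketch no gradients are computed at any point: the priors are only queried for next-token probabilities, and the likelihood is a fixed lookup table. A real system would replace the stand-in priors with actual pre-trained autoregressive models over VQ-VAE tokens, and the K × K joint posterior would need a more economical treatment for large codebooks.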
Taxonomy
Topics: Speech and Audio Processing · Music and Audio Processing · Speech Recognition and Synthesis
Methods: Test · VQ-VAE
