TL;DR
This paper introduces a novel sparse pursuit algorithm and dictionary learning approach for blind source separation of polyphonic music recordings, leveraging a pitch-invariant spectral representation to improve separation quality.
Contribution
It presents a new sparse pursuit algorithm combined with dictionary learning for pitch-invariant source separation. The method accounts for inharmonicity in the instrument sounds, and the learned dictionary can be transferred across similar recordings.
Findings
High-quality separation when model assumptions are met
Effective handling of inharmonicity in instruments
Dictionary transferability across similar recordings
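To make the inharmonicity finding concrete, here is a minimal sketch of how partial frequencies deviate from exact integer multiples of the fundamental. It assumes the standard stiff-string model f_h = f_1 * h * sqrt(1 + b * h^2); the summary above does not state which parametrization the paper uses, so the formula, the function name `partial_frequencies`, and the parameter `b` are illustrative assumptions.

```python
import math


def partial_frequencies(f1, n_partials, b=0.0):
    """Return the frequencies of partials h = 1..n_partials.

    Assumption: stiff-string inharmonicity model, not necessarily the
    paper's own parametrization. b = 0 gives a purely harmonic series.
    """
    return [f1 * h * math.sqrt(1.0 + b * h * h) for h in range(1, n_partials + 1)]


harmonic = partial_frequencies(100.0, 3)            # exact multiples of 100 Hz
inharmonic = partial_frequencies(100.0, 3, b=1e-3)  # partials stretched upward
```

With `b > 0` every partial lies slightly above its harmonic position, and the deviation grows with the partial index `h`. A separation model that assumes exact harmonics would therefore misplace the higher partials of such instruments.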
Abstract
We propose an algorithm for the blind separation of single-channel audio signals. It is based on a parametric model that describes the spectral properties of the sounds of musical instruments independently of pitch. We develop a novel sparse pursuit algorithm that can match the discrete frequency spectra from the recorded signal with the continuous spectra delivered by the model. We first use this algorithm to convert an STFT spectrogram from the recording into a novel form of log-frequency spectrogram whose resolution exceeds that of the mel spectrogram. We then make use of the pitch-invariant properties of that representation in order to identify the sounds of the instruments via the same sparse pursuit method. As the model parameters which characterize the musical instruments are not known beforehand, we train a dictionary that contains them, using a modified version of Adam.…
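The pitch-invariance property that the abstract relies on can be illustrated with a small sketch: on a logarithmic frequency axis, changing the pitch of a tone moves all of its partials by the same offset, so the spectral pattern is translated rather than stretched. The bin resolution `bins_per_octave`, the reference frequency `f_ref`, and the function name are hypothetical choices made for illustration, and this is not the paper's spectrogram construction.

```python
import math


def log_frequency_positions(f1, n_partials, bins_per_octave=12, f_ref=55.0):
    """Map harmonic partials h * f1 to positions on a log-frequency axis.

    Assumption: purely harmonic partials and an arbitrary reference
    frequency; both are illustrative, not taken from the paper.
    """
    return [
        bins_per_octave * math.log2(f1 * h / f_ref)
        for h in range(1, n_partials + 1)
    ]


low = log_frequency_positions(220.0, 4)
high = log_frequency_positions(440.0, 4)  # same tone, one octave higher

# Every partial moves by the same number of bins (one octave here),
# so the pattern is shifted as a whole.
shifts = [b - a for a, b in zip(low, high)]
```

Because each entry of `shifts` is the same up to floating-point rounding, a single pitch-independent template per instrument can in principle be matched at any position along the log-frequency axis.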
Taxonomy
Methods: Adam
