Music Source Restoration with Ensemble Separation and Targeted Reconstruction
Xinlong Deng, Yu Xia, Jie Jiang

TL;DR
This paper introduces a two-stage music source restoration system that combines an ensemble of separation models with targeted reconstruction to reverse complex production effects and recover the original music stems. It surpasses the baselines on all metrics and ranks second among all submissions on the official MSR benchmark.
Contribution
The paper presents a two-stage approach that pairs an ensemble of pre-trained separation models with pre-trained BSRNN-based restoration models for music source recovery, surpassing the baselines on all metrics on the official MSR benchmark.
Findings
Outperforms baseline methods on all metrics
Ranks second among all submissions in the MSR challenge
Demonstrates effectiveness of combined separation and restoration approach
Abstract
The inaugural Music Source Restoration (MSR) Challenge targets the recovery of original, unprocessed stems from fully mixed and mastered music. Unlike conventional music source separation, MSR requires reversing complex production processes such as equalization, compression, reverberation, and other real-world degradations. To address MSR, we propose a two-stage system. First, an ensemble of pre-trained separation models produces preliminary source estimates. Then, a set of pre-trained BSRNN-based restoration models performs targeted reconstruction to refine these estimates. On the official MSR benchmark, our system surpasses the baselines on all metrics, ranking second among all submissions. The code is available at https://github.com/xinghour/Music-source-restoration-CUPAudioGroup
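The two-stage flow described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the ensemble outputs are combined, so a weighted average of waveform estimates is assumed here, and the names `ensemble_separate`, `restore_stems`, and `two_stage_restore` are hypothetical. Separators and restorers are passed in as plain callables standing in for the pre-trained separation and BSRNN-based restoration models.

```python
from typing import Callable, Dict, Optional, Sequence

import numpy as np

# Hypothetical interfaces: a separator maps a mixture to named stem estimates,
# a restorer maps one stem estimate to a refined estimate of the same shape.
Separator = Callable[[np.ndarray], Dict[str, np.ndarray]]
Restorer = Callable[[np.ndarray], np.ndarray]


def ensemble_separate(
    mixture: np.ndarray,
    separators: Sequence[Separator],
    weights: Optional[Sequence[float]] = None,
) -> Dict[str, np.ndarray]:
    """Stage 1 (assumed form): weighted average of each model's stem estimates."""
    if not separators:
        raise ValueError("at least one separator is required")
    if weights is None:
        weights = [1.0] * len(separators)
    if len(weights) != len(separators):
        raise ValueError("weights and separators must have the same length")
    w = np.asarray(weights, dtype=np.float64)
    w = w / w.sum()  # normalize so the weights sum to 1
    outputs = [sep(mixture) for sep in separators]
    stem_names = list(outputs[0].keys())
    return {
        name: sum(wi * out[name] for wi, out in zip(w, outputs))
        for name in stem_names
    }


def restore_stems(
    estimates: Dict[str, np.ndarray],
    restorers: Dict[str, Restorer],
) -> Dict[str, np.ndarray]:
    """Stage 2: apply a stem-specific restorer where one exists, else pass through."""
    return {
        name: restorers[name](est) if name in restorers else est
        for name, est in estimates.items()
    }


def two_stage_restore(
    mixture: np.ndarray,
    separators: Sequence[Separator],
    restorers: Dict[str, Restorer],
    weights: Optional[Sequence[float]] = None,
) -> Dict[str, np.ndarray]:
    """Run separation first, then targeted restoration on each preliminary estimate."""
    preliminary = ensemble_separate(mixture, separators, weights)
    return restore_stems(preliminary, restorers)
```

Keeping the two stages as separate functions mirrors the description above: the restoration models only ever see the preliminary estimates, so either stage could be swapped out without touching the other.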
Taxonomy
Topics: Speech and Audio Processing · Hearing Loss and Rehabilitation · Music and Audio Processing
