Instrument Separation of Symbolic Music by Explicitly Guided Diffusion Model
Sangjun Han, Hyeongrae Ihm, DaeHan Ahn, Woohyung Lim

TL;DR
This paper introduces a diffusion model approach for instrument separation in symbolic music, explicitly guiding the process to maintain consistency between mixtures and individual instrument tracks, achieving high-fidelity results.
Contribution
The paper presents a novel diffusion-based method with explicit guidance for instrument separation in symbolic music, improving fidelity and consistency.
Findings
High-fidelity sample generation for multitrack symbolic music
Effective preservation of mixture-instrument consistency
Demonstrated creativity in separated instrument outputs
Abstract
Similar to colorization in computer vision, instrument separation assigns instrument labels (e.g., piano, guitar) to the notes of an unlabeled mixture that contains only performance information. To address this problem, we adopt diffusion models and explicitly guide them to preserve consistency between mixtures and music. The quantitative results show that our proposed model can generate high-fidelity samples for multitrack symbolic music with creativity.
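To make the consistency requirement concrete, the sketch below treats a mixture as a binary piano roll and checks that the separated tracks, when merged, reproduce it exactly. This is an illustration only and not the authors' guidance method: the piano-roll layout `(instruments, time, pitch)`, the function names `mixture_from_tracks` and `project_to_mixture`, and the rule of giving each mixture note to the single highest-scoring instrument are all assumptions made for this example. Random scores stand in for the output of a generative model.

```python
import numpy as np


def mixture_from_tracks(tracks: np.ndarray) -> np.ndarray:
    """Collapse (instruments, time, pitch) boolean rolls into one (time, pitch) mixture."""
    return tracks.any(axis=0)


def project_to_mixture(scores: np.ndarray, mixture: np.ndarray) -> np.ndarray:
    """Assign every active mixture cell to the instrument with the highest score.

    scores:  (instruments, time, pitch) real-valued preferences (hypothetical model output)
    mixture: (time, pitch) boolean piano roll without instrument labels
    Returns a boolean array shaped like `scores` whose merge equals `mixture`.
    """
    winner = scores.argmax(axis=0)                      # best instrument per cell
    n_instruments = scores.shape[0]
    one_hot = np.arange(n_instruments)[:, None, None] == winner[None]
    return one_hot & mixture[None]                      # drop cells the mixture does not contain


# Toy data: 3 instruments, 8 time steps, 12 pitches.
rng = np.random.default_rng(0)
mixture = rng.random((8, 12)) < 0.3
scores = rng.random((3, 8, 12))

tracks = project_to_mixture(scores, mixture)

# Consistency check: merging the separated tracks gives back the mixture.
assert np.array_equal(mixture_from_tracks(tracks), mixture)
```

The one-instrument-per-note rule is a simplification: in real multitrack music two instruments can play the same pitch at the same time, which this sketch does not model.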
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Neuroscience and Music Perception
