TL;DR
Sound2Synth is a deep learning pipeline that estimates FM synthesizer parameters from audio, achieving state-of-the-art results and real-world applicability on Dexed synthesizer sounds.
Contribution
The paper introduces Sound2Synth, a multi-modal deep learning pipeline, together with Prime-Dilated Convolution (PDC), a network structure designed specifically for synthesizer parameter estimation.
Findings
Achieves state-of-the-art performance in synthesizer parameter estimation.
Reports the first real-world applicable results on the Dexed FM synthesizer.
Demonstrates the effectiveness of the Prime-Dilated Convolution (PDC) network structure.
Abstract
A synthesizer is a type of electronic musical instrument that is now widely used in modern music production and sound design. Each parameter configuration of a synthesizer produces a unique timbre and can be viewed as a unique instrument. Estimating the parameter configuration that best reproduces a given sound timbre, known as the synthesizer parameter estimation problem, is important yet complicated. We propose Sound2Synth, a multi-modal deep-learning-based pipeline, together with Prime-Dilated Convolution (PDC), a network structure specially designed to solve this problem. Our method achieves not only state-of-the-art (SOTA) results but also the first real-world applicable results on the Dexed synthesizer, a popular FM synthesizer.
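To make the estimation problem concrete, here is a minimal sketch of a two-operator FM tone and a brute-force parameter search over it. It is not the paper's method: Sound2Synth uses a deep learning pipeline, and Dexed has far more parameters than this toy. The function names (`fm_tone`, `l2_distance`, `estimate`), the two parameters (`ratio`, `index`), and the waveform-distance criterion are all illustrative assumptions.

```python
import math

def fm_tone(carrier_hz, ratio, index, sample_rate=8000, n_samples=64):
    """Toy two-operator FM: a sine carrier phase-modulated by a sine modulator.

    ratio = modulator frequency / carrier frequency (hypothetical parameter name)
    index = modulation depth (hypothetical parameter name)
    """
    mod_hz = carrier_hz * ratio
    out = []
    for n in range(n_samples):
        t = n / sample_rate
        modulator = index * math.sin(2 * math.pi * mod_hz * t)
        out.append(math.sin(2 * math.pi * carrier_hz * t + modulator))
    return out

def l2_distance(a, b):
    """Euclidean distance between two equal-length waveforms."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def estimate(target, carrier_hz, ratios, indices):
    """Brute-force stand-in for a learned estimator: try every candidate pair."""
    candidates = [(r, i) for r in ratios for i in indices]
    return min(candidates,
               key=lambda p: l2_distance(target, fm_tone(carrier_hz, p[0], p[1])))

# A "recorded" sound produced with parameters the estimator does not know.
target = fm_tone(220.0, ratio=2.0, index=1.5)
best = estimate(target, 220.0, ratios=[1.0, 2.0, 3.0], indices=[0.5, 1.5, 2.5])
```

An exhaustive search like this only works for a tiny, coarse grid. With many interacting parameters the search space grows too quickly to enumerate, which is one reason to learn the mapping from audio to parameters instead.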
Taxonomy
Methods: Prime Dilated Convolution · Dilated Convolution · Convolution
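For readers unfamiliar with the listed methods, the following is a minimal sketch of a plain 1-D dilated convolution. This page does not describe how Prime-Dilated Convolution selects or combines its dilation rates, so the prime rates used here (2, 3, 5) are only a guess suggested by the name, and `dilated_conv1d` is a hypothetical helper rather than the paper's layer.

```python
def dilated_conv1d(signal, kernel, dilation):
    """'Valid' 1-D convolution (cross-correlation form) with dilated kernel taps.

    Tap k of the kernel reads signal[i + k * dilation], so a larger dilation
    widens the receptive field without adding kernel weights.
    """
    span = (len(kernel) - 1) * dilation  # distance covered by the kernel
    return [
        sum(kernel[k] * signal[i + k * dilation] for k in range(len(kernel)))
        for i in range(len(signal) - span)
    ]

signal = list(range(12))   # a simple ramp: 0, 1, ..., 11
kernel = [1, 0, -1]        # difference between the first and last tap

# Dilation 1 is an ordinary convolution; 2, 3, 5 are illustrative prime rates.
outputs = {d: dilated_conv1d(signal, kernel, d) for d in (1, 2, 3, 5)}
```

On the ramp input, each output value equals minus the distance between the first and last tap (2 times the dilation), which shows directly how the dilation rate sets the span each output sees.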
