TL;DR
This paper introduces a neural network-based method for synthesizing percussive sounds with intuitive control over high-level timbral features, enabling users to shape sounds without deep signal processing knowledge.
Contribution
It presents a feedforward convolutional network that maps high-level timbral features to waveforms, together with two datasets (one restrictive, one covering a broader spectrum of sounds) and evaluations of both sound quality and control effectiveness.
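To make the idea of a feature-to-waveform mapping concrete, here is a minimal pure-Python sketch, not the paper's architecture. It assumes a single channel, random untrained weights, one dense layer followed by stride-2 transposed convolutions, and three unnamed input features. The names `dense`, `transposed_conv1d`, `synthesize`, and `random_params` are all hypothetical.

```python
import math
import random

def dense(x, weights, bias):
    """Fully connected layer: one output value per weight row."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]

def transposed_conv1d(x, kernel, stride):
    """Upsample a 1-D signal: each input sample adds a scaled copy of the kernel."""
    out = [0.0] * ((len(x) - 1) * stride + len(kernel))
    for i, v in enumerate(x):
        for k, w in enumerate(kernel):
            out[i * stride + k] += v * w
    return out

def synthesize(features, params):
    """Map a feature vector to a waveform with samples in [-1, 1]."""
    # Project the features to a short hidden sequence.
    h = [math.tanh(v) for v in dense(features, params["w"], params["b"])]
    # Each transposed convolution roughly doubles the sequence length.
    # (Illustrative simplification: no nonlinearity between these layers.)
    for kernel in params["kernels"]:
        h = transposed_conv1d(h, kernel, stride=2)
    # Squash the final samples into the audio range.
    return [math.tanh(v) for v in h]

def random_params(n_features, hidden, n_layers, kernel_size, seed=0):
    """Untrained weights, for illustration only."""
    rng = random.Random(seed)
    return {
        "w": [[rng.uniform(-1, 1) for _ in range(n_features)] for _ in range(hidden)],
        "b": [rng.uniform(-1, 1) for _ in range(hidden)],
        "kernels": [[rng.uniform(-1, 1) for _ in range(kernel_size)]
                    for _ in range(n_layers)],
    }

params = random_params(n_features=3, hidden=16, n_layers=4, kernel_size=4)
waveform = synthesize([0.2, 0.8, 0.5], params)  # three hypothetical feature values
```

With untrained weights the output is noise-like. The sketch only shows the data flow from a small control vector to a much longer sample sequence in a single forward pass.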
Findings
The model produces sounds matching specified timbral features.
Subjective listening tests confirm high sound quality.
The approach enables intuitive sound shaping without signal processing expertise.
Abstract
We present a deep neural network-based methodology for synthesising percussive sounds with control over high-level timbral characteristics of the sounds. This approach allows for intuitive control of a synthesizer, enabling the user to shape sounds without extensive knowledge of signal processing. We use a feedforward convolutional neural network-based architecture, which is able to map input parameters to the corresponding waveform. We propose two datasets to evaluate our approach both in a restrictive context and in one covering a broader spectrum of sounds. The timbral features used as parameters are taken from recent literature in signal processing. We also use these features for evaluation and validation of the presented model, to ensure that changing the input parameters produces a congruent waveform with the desired characteristics. Finally, we evaluate the quality of the output…
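The validation idea in the abstract, checking that a change in an input parameter produces a congruent change in the output waveform, can be sketched as follows. This is an illustration under stated assumptions, not the paper's evaluation code. `toy_synth` is a hypothetical stand-in for the trained model, a power-weighted spectral centroid serves as a simple brightness-like proxy rather than the paper's feature set, and the sample rate and decay constant are arbitrary.

```python
import cmath
import math

SAMPLE_RATE = 16000  # Hz, assumed for this sketch
N_SAMPLES = 512      # kept short so the naive DFT stays fast

def toy_synth(brightness):
    """Stand-in synthesizer: a decaying sine whose pitch rises with `brightness` in [0, 1]."""
    freq = 200.0 + 3000.0 * brightness
    return [
        math.exp(-n / 100.0) * math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
        for n in range(N_SAMPLES)
    ]

def spectral_centroid(signal):
    """Power-weighted mean frequency in Hz, from a naive DFT over non-negative bins."""
    n = len(signal)
    total_power = 0.0
    weighted = 0.0
    for k in range(n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(signal))
        power = abs(coeff) ** 2
        total_power += power
        weighted += power * k * SAMPLE_RATE / n
    return weighted / total_power

def is_congruent(controls, measure, synth):
    """True if the measured feature rises strictly as the control value rises."""
    measured = [measure(synth(c)) for c in sorted(controls)]
    return all(a < b for a, b in zip(measured, measured[1:]))

controls = [0.0, 0.5, 1.0]
centroids = [spectral_centroid(toy_synth(c)) for c in controls]
congruent = is_congruent(controls, spectral_centroid, toy_synth)
```

In a real evaluation, `toy_synth` would be replaced by the trained network and `spectral_centroid` by the timbral feature extractor that produced the control values.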
