Spectrogram Inpainting for Interactive Generation of Instrument Sounds

Théis Bazin; Gaëtan Hadjeres; Philippe Esling; Mikhail Malt

arXiv:2104.07519 · cs.SD · April 16, 2021

TL;DR

This paper introduces a novel spectrogram inpainting method for instrument sound generation, combining VQ-VAE-2 and Transformers, enabling interactive sound shaping for musicians and artists.

Contribution

It adapts image inpainting techniques to spectrograms using VQ-VAE-2 and Transformers, and provides an open-source interactive tool for sound editing.

Findings

01 Effective spectrogram inpainting is demonstrated on the NSynth dataset

02 An open-source web interface enables interactive sound transformation

03 The method allows fine-grained control over instrument sound synthesis

Abstract

Modern approaches to sound synthesis using deep neural networks are hard to control, especially when fine-grained conditioning information is not available, hindering their adoption by musicians. In this paper, we cast the generation of individual instrumental notes as an inpainting-based task, introducing novel and unique ways to iteratively shape sounds. To this end, we propose a two-step approach: first, we adapt the VQ-VAE-2 image generation architecture to spectrograms in order to convert real-valued spectrograms into compact discrete codemaps; second, we implement token-masked Transformers for the inpainting-based generation of these codemaps. We apply the proposed architecture to masked resampling tasks on the NSynth dataset. Most crucially, we open-source an interactive web interface to transform sounds by inpainting, for artists and practitioners alike, opening up to new,…
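
As a rough illustration of the masked resampling idea described in the abstract, the sketch below redraws only the masked cells of a small discrete codemap and leaves every other cell untouched. It is not the authors' implementation: `resample_masked`, `uniform_sampler`, and `CODEBOOK_SIZE` are hypothetical names, the codemap is a plain list of integer tokens, and a uniform random draw stands in for the token-masked Transformer that would normally predict each token from its context.

```python
import random

# Assumption for illustration only: a tiny codebook of 8 discrete tokens.
CODEBOOK_SIZE = 8


def uniform_sampler(codemap, mask, row, col, rng):
    """Hypothetical stand-in for a trained model.

    A real system would condition on the surrounding tokens in `codemap`;
    this placeholder ignores the context and draws a token uniformly.
    """
    return rng.randrange(CODEBOOK_SIZE)


def resample_masked(codemap, mask, sample_token, rng):
    """Return a copy of `codemap` with every masked cell redrawn.

    Cells are visited in row-major order, so the sampler is given the
    partially updated map, including cells that were already redrawn.
    """
    result = [row[:] for row in codemap]  # never mutate the caller's codemap
    for r, mask_row in enumerate(mask):
        for c, is_masked in enumerate(mask_row):
            if is_masked:
                result[r][c] = sample_token(result, mask, r, c, rng)
    return result


# A 2x4 codemap; True marks the region the user selected for regeneration.
codemap = [[0, 1, 2, 3],
           [4, 5, 6, 7]]
mask = [[False, True, True, False],
        [False, False, True, False]]

inpainted = resample_masked(codemap, mask, uniform_sampler, random.Random(0))
```

Passing the sampler in as a function keeps the masking logic independent of the model, so the placeholder could be swapped for a learned predictor without changing `resample_masked`.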

Code & Models

Repositories

SonyCSLParis/interactive-spectrogram-inpainting
PyTorch · Official

Taxonomy

Methods: PixelCNN