HyperGANStrument: Instrument Sound Synthesis and Editing with Pitch-Invariant Hypernetworks

Zhe Zhang; Taketo Akama

arXiv:2401.04558 · cs.SD · January 10, 2024 · 1 citation

Open Access

TL;DR

HyperGANStrument introduces a pitch-invariant hypernetwork to modulate a pre-trained GAN-based instrument sound synthesizer, significantly improving sound reconstruction fidelity, editability, and diversity through adversarial fine-tuning.

Contribution

It presents a novel pitch-invariant hypernetwork approach that enhances a GAN-based instrument sound synthesizer's reconstruction and editing capabilities.

Findings

01 Improved sound reconstruction fidelity and generation diversity.

02 Enhanced editability of user-provided instrument sounds.

03 Experiments show that the model enhances the generation capability of GANStrument.

Abstract

GANStrument, exploiting GANs with a pitch-invariant feature extractor and instance conditioning technique, has shown remarkable capabilities in synthesizing realistic instrument sounds. To further improve the reconstruction ability and pitch accuracy to enhance the editability of user-provided sound, we propose HyperGANStrument, which introduces a pitch-invariant hypernetwork to modulate the weights of a pre-trained GANStrument generator, given a one-shot sound as input. The hypernetwork modulation provides feedback for the generator in the reconstruction of the input sound. In addition, we take advantage of an adversarial fine-tuning scheme for the hypernetwork to improve the reconstruction fidelity and generation diversity of the generator. Experimental results show that the proposed model not only enhances the generation capability of GANStrument but also significantly improves the…
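The weight-modulation idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the single-layer generator, the linear hypernetwork, the per-output-channel multiplicative form `W' = W * (1 + delta)`, and all names (`W_frozen`, `predict_offsets`, `modulate`, `generator_layer`) are assumptions chosen only to show how a hypernetwork might adjust frozen generator weights from a feature of the input sound.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen weights of one layer of a hypothetical pre-trained generator,
# shaped (out_features, in_features). These are never updated.
W_frozen = rng.standard_normal((8, 16))

# Hypothetical hypernetwork: a single linear map from a feature vector of
# the input sound to one offset per output channel (an assumption; the
# real network is not described in this excerpt).
FEATURE_DIM = 4
H = 0.01 * rng.standard_normal((W_frozen.shape[0], FEATURE_DIM))


def predict_offsets(feature):
    """Map an input-sound feature to per-output-channel weight offsets."""
    return H @ feature


def modulate(weights, offsets):
    """Return modulated weights W' = W * (1 + delta), one delta per row."""
    return weights * (1.0 + offsets[:, None])


def generator_layer(x, weights):
    """One toy generator layer: linear map followed by tanh."""
    return np.tanh(weights @ x)


# Stand-ins for a pitch-invariant feature of the one-shot input sound
# and for the generator's latent input.
feature = rng.standard_normal(FEATURE_DIM)
latent = rng.standard_normal(16)

offsets = predict_offsets(feature)
W_mod = modulate(W_frozen, offsets)

out_base = generator_layer(latent, W_frozen)  # pre-trained behaviour
out_mod = generator_layer(latent, W_mod)      # input-specific behaviour
```

In this sketch, zero offsets recover the pre-trained layer exactly, so the hypernetwork only has to learn a correction on top of the frozen generator rather than a full set of weights.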

Taxonomy

Topics: Music and Audio Processing · Music Technology and Sound Studies · Speech and Audio Processing

Methods: HyperNetwork