Neural Style Transfer for Audio Spectograms

Prateek Verma; Julius O. Smith

arXiv:1801.01589 · cs.SD · December 24, 2024 · 34 cites

TL;DR

This paper introduces a neural style transfer method for audio spectrograms, enabling artistic sound transformations like bandwidth modification and timbral transfer using a unified neural architecture.

Contribution

It adapts image style transfer techniques to audio, allowing diverse sound transformations with a single neural model, simplifying previous complex signal processing pipelines.

Findings

01. Successfully performed bandwidth expansion and compression.

02. Achieved timbral transfer from singing voice to instruments.

03. Unified approach reduces need for multiple specialized pipelines.

Abstract

There has been fascinating work on creating artistic transformations of images by Gatys. This was revolutionary in how we can in some sense alter the 'style' of an image while generally preserving its 'content'. In our work, we present a method for creating new sounds using a similar approach, treating it as a style-transfer problem, starting from a random-noise input signal and iteratively using back-propagation to optimize the sound to conform to filter-outputs from a pre-trained neural architecture of interest. For demonstration, we investigate two different tasks, resulting in bandwidth expansion/compression, and timbral transfer from singing voice to musical instruments. A feature of our method is that a single architecture can generate these different audio-style-transfer types using the same set of parameters which otherwise require different complex hand-tuned diverse signal…
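
The optimization loop the abstract describes (start from random noise, then back-propagate so the signal matches filter outputs of a fixed network) can be sketched as below. This is a toy illustration under stated assumptions, not the authors' implementation: a random ReLU filter bank `w` stands in for the pre-trained architecture, the Gram-matrix style loss is borrowed from image style transfer, and `stylize`, `alpha`, `beta`, and the toy spectrogram sizes are all hypothetical names and values chosen for the example.

```python
import numpy as np

def features(w, x):
    """ReLU responses of a fixed filter bank w applied to spectrogram x (freq x time)."""
    return np.maximum(w @ x, 0.0)

def gram(f):
    """Time-averaged correlations between filter channels, used here as the style statistic."""
    return f @ f.T / f.shape[1]

def loss_and_grad(x, w, content_feat, style_gram, alpha, beta):
    """Weighted content + style loss and its gradient with respect to x."""
    z = w @ x
    f = np.maximum(z, 0.0)
    d_content = f - content_feat
    d_style = gram(f) - style_gram
    loss = 0.5 * alpha * np.sum(d_content ** 2) + 0.5 * beta * np.sum(d_style ** 2)
    # Gradient w.r.t. the features, then back through the ReLU and the filters.
    grad_f = alpha * d_content + beta * 2.0 * d_style @ f / f.shape[1]
    grad_x = w.T @ (grad_f * (z > 0))
    return loss, grad_x

def stylize(content, style, w, alpha=1.0, beta=10.0, steps=200, lr=1e-2, seed=0):
    """Optimize a noise spectrogram toward content features and style statistics."""
    rng = np.random.default_rng(seed)
    content_feat = features(w, content)
    style_gram = gram(features(w, style))
    x = rng.normal(size=content.shape)  # random-noise starting point
    loss, grad = loss_and_grad(x, w, content_feat, style_gram, alpha, beta)
    history = [loss]
    for _ in range(steps):
        candidate = x - lr * grad
        cand_loss, cand_grad = loss_and_grad(
            candidate, w, content_feat, style_gram, alpha, beta
        )
        if cand_loss < loss:
            # Accept the step and grow the step size slightly.
            x, loss, grad = candidate, cand_loss, cand_grad
            lr *= 1.1
        else:
            # Reject the step and shrink the step size.
            lr *= 0.5
        history.append(loss)
    return x, history

# Toy data: random non-negative "spectrograms" (assumption, not real audio).
rng = np.random.default_rng(1)
n_freq, n_frames, n_filters = 16, 32, 24
w = rng.normal(size=(n_filters, n_freq)) / np.sqrt(n_freq)
content = np.abs(rng.normal(size=(n_freq, n_frames)))
style = np.abs(rng.normal(size=(n_freq, 40)))  # style may have a different length
result, history = stylize(content, style, w)
print(f"loss: {history[0]:.3f} -> {history[-1]:.3f}")
```

Only the input `x` is updated; the filter bank stays fixed throughout, which mirrors the idea of optimizing the sound rather than the network. Steps are accepted only when they lower the loss, so the recorded loss never increases. Turning the optimized spectrogram back into audio would need a separate inversion step that this sketch does not cover.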

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies