Audio inpainting with generative adversarial network

P. P. Ebner; A. Eltelt

arXiv:2003.07704 · eess.AS · March 18, 2020 · 25 citations

TL;DR

This paper proposes a Wasserstein GAN architecture for inpainting long (500 ms) audio gaps and reports better inpainting quality and high-frequency reconstruction than the classical WGAN model on piano, guitar, and piano-with-string-orchestra recordings.

Contribution

Introduces a new WGAN architecture with short- and long-range border handling for improved long-range audio inpainting.
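
For reference, the standard Wasserstein GAN objective that such an architecture builds on can be written as below. Conditioning the generator and critic on a border context c is an assumption made here to illustrate the inpainting setting; the symbols G, D, x, z, and c are illustrative, and the paper's exact loss may differ.

```latex
% Standard WGAN objective, written with an assumed border context c.
% D ranges over 1-Lipschitz critics; G maps noise z and context c to a gap fill.
\min_{G} \; \max_{D \in \mathcal{D}_{1\text{-Lip}}} \;
  \mathbb{E}_{(x, c) \sim p_{\text{data}}} \big[ D(x, c) \big]
  - \mathbb{E}_{z \sim p_z,\; c \sim p_{\text{data}}} \big[ D(G(z, c), c) \big]
```

Here x would be the true content of the gap and G(z, c) the generated fill; the critic scores how well each fits its surrounding borders.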

Findings

01. The proposed model outperforms the classical WGAN in audio inpainting quality.

02. Better reconstruction of high-frequency content.

03. More effective for instruments with lower frequency spectra.

Abstract

We study the ability of a Wasserstein Generative Adversarial Network (WGAN) to generate missing audio content that is, in context, statistically similar to the sound and the neighboring borders. We address the challenge of inpainting long-range audio gaps (500 ms) using WGAN models. Compared to the classical WGAN model, we improved the quality of the inpainted part with a newly proposed WGAN architecture that uses both short-range and long-range neighboring borders. Performance was compared on two different instruments (piano and guitar) and on virtuoso pianists together with a string orchestra. The objective difference grade (ODG) was used to evaluate the performance of both architectures. The proposed model outperforms the classical WGAN model and improves the reconstruction of high-frequency content. Further, we got better results for instruments where the frequency…
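
The gap-and-borders setup described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' preprocessing: only the 500 ms gap length comes from the abstract, while the function name `split_context`, the border lengths `short_ms` and `long_ms`, and the returned dictionary layout are all assumptions chosen for the example.

```python
def split_context(signal, sample_rate, gap_start,
                  gap_ms=500.0, short_ms=250.0, long_ms=1000.0):
    """Cut a gap out of `signal` and return it with its neighboring borders.

    gap_ms=500 follows the abstract; short_ms and long_ms are hypothetical
    border lengths picked for illustration only.
    """
    if long_ms < short_ms:
        raise ValueError("long_ms must be at least short_ms")

    # Convert durations in milliseconds to sample counts.
    gap_len = int(round(sample_rate * gap_ms / 1000.0))
    short_len = int(round(sample_rate * short_ms / 1000.0))
    long_len = int(round(sample_rate * long_ms / 1000.0))
    gap_end = gap_start + gap_len

    # The long-range border must fit on both sides of the gap.
    if gap_start - long_len < 0 or gap_end + long_len > len(signal):
        raise ValueError("not enough context around the gap")

    return {
        # Samples the generator would have to reconstruct.
        "target": signal[gap_start:gap_end],
        # Short-range borders: samples immediately adjacent to the gap.
        "short_left": signal[gap_start - short_len:gap_start],
        "short_right": signal[gap_end:gap_end + short_len],
        # Long-range borders: a wider window on each side of the gap.
        "long_left": signal[gap_start - long_len:gap_start],
        "long_right": signal[gap_end:gap_end + long_len],
    }
```

For a 16 kHz signal, `split_context(samples, 16000, gap_start=32000)` would return an 8000-sample target along with 4000-sample short borders and 16000-sample long borders on each side. How the two border ranges are actually fed to the generator and critic is not specified in the abstract and is left out of this sketch.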

Code & Models

Repositories

nperraud/gan_audio_inpainting (tf, official)

Taxonomy

Topics: Music and Audio Processing · Generative Adversarial Networks and Image Synthesis · Speech and Audio Processing

Methods: Convolution · Wasserstein GAN