MicAugment: One-shot Microphone Style Transfer

Zalán Borsos; Yunpeng Li; Beat Gfeller; Marco Tagliasacchi

arXiv:2010.09658·cs.SD·October 20, 2020

TL;DR

MicAugment is a one-shot microphone style transfer method. From only a few seconds of audio recorded by a target device, it synthesizes audio as if it had been recorded under that device's conditions, which makes downstream audio models more robust when used as data augmentation.

Contribution

The paper introduces a one-shot approach to microphone style transfer: it identifies the transformations associated with the target device's acquisition pipeline and applies them to other audio, improving the robustness of downstream models.

Findings

01  Successfully applies style transfer to real audio

02  Significantly improves downstream model robustness

03  Effective as a data augmentation technique

Abstract

A crucial aspect of the successful deployment of audio-based models "in-the-wild" is robustness to the transformations introduced by heterogeneous acquisition conditions. In this work, we propose a method to perform one-shot microphone style transfer. Given only a few seconds of audio recorded by a target device, MicAugment identifies the transformations associated with the input acquisition pipeline and uses the learned transformations to synthesize audio as if it were recorded under the same conditions as the target audio. We show that our method can successfully apply the style transfer to real audio and that it significantly increases model robustness when used as data augmentation in downstream tasks.
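
The abstract does not spell out the device model or how it is fitted. As a rough illustration of the general idea only (estimate a device-specific transformation from a short target clip, then apply it to other audio), the sketch below uses average-spectrum matching, which is a much simpler stand-in and not the paper's method. It assumes NumPy, mono float arrays at a single sample rate, clips at least one frame long, and a clean reference clip whose content is spectrally similar to the target's. All function names are hypothetical.

```python
import numpy as np


def average_log_spectrum(audio, frame=512, hop=256):
    """Mean log-magnitude spectrum over Hann-windowed frames."""
    window = np.hanning(frame)
    n_frames = 1 + (len(audio) - frame) // hop
    frames = np.stack(
        [audio[i * hop:i * hop + frame] * window for i in range(n_frames)]
    )
    magnitudes = np.abs(np.fft.rfft(frames, axis=1))
    return np.log(magnitudes + 1e-8).mean(axis=0)


def estimate_device_gain(reference_clip, target_clip, frame=512, hop=256):
    """Per-frequency gain mapping the reference's average spectrum to the target's.

    Assumption: both clips contain spectrally similar content, so the
    difference between their average spectra is due to the device.
    """
    diff = (average_log_spectrum(target_clip, frame, hop)
            - average_log_spectrum(reference_clip, frame, hop))
    return np.exp(diff)


def gain_to_fir(gain):
    """Turn a magnitude response into a tapered linear-phase FIR filter."""
    frame = 2 * (len(gain) - 1)
    impulse = np.fft.irfft(gain, n=frame)   # zero-phase impulse response
    impulse = np.roll(impulse, frame // 2)  # centre the main lobe
    return impulse * np.hanning(frame)      # taper to reduce ripple


def apply_device_style(audio, gain):
    """Filter audio with the estimated device response (same output length)."""
    return np.convolve(audio, gain_to_fir(gain), mode="same")
```

In an augmentation setting, `estimate_device_gain` would be called once on the few seconds of target-device audio, and `apply_device_style` would then be applied to each training example. A linear filter of this kind captures only a frequency response; it does not model noise, clipping, or other nonlinear effects of a real acquisition pipeline.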
