# Autoencoder Based Architecture For Fast & Real Time Audio Style Transfer

**Authors:** Dhruv Ramani, Samarjit Karmakar, Anirban Panda, Asad Ahmed, Pratham Tangri

arXiv: 1812.07159 · 2018-12-27

## TL;DR

This paper introduces an autoencoder-based neural network architecture for fast, real-time audio style transfer. It generates stylized audio in a single forward pass, and the authors report better audio quality and shorter training time than prior neural approaches.

## Contribution

The paper presents a new autoencoder architecture that enhances audio style transfer by enabling single-pass generation and reducing training time.

## Key findings

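- Generates stylized audio for a content audio in a single forward pass, which makes real-time use feasible
- Reported to improve the quality of the stylized audio over existing neural-network approaches
- Reported to reduce the time taken to train the network
- Evaluated on speech signals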

## Abstract

Recently, there has been great interest in the field of audio style transfer, where a stylized audio is generated by imposing the style of a reference audio on the content of a target audio. We improve on the current approaches which use neural networks to extract the content and the style of the audio signal and propose a new autoencoder based architecture for the task. This network generates a stylized audio for a content audio in a single forward pass. The proposed network architecture proves to be advantageous over the quality of audio produced and the time taken to train the network. The network is experimented on speech signals to confirm the validity of our proposal.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1812.07159/full.md

## Figures

7 figures with captions in the complete paper: https://tomesphere.com/paper/1812.07159/full.md

## References

10 references — full list in the complete paper: https://tomesphere.com/paper/1812.07159/full.md

---
Source: https://tomesphere.com/paper/1812.07159