Extreme Audio Time Stretching Using Neural Synthesis
Leonardo Fierro, Alec Wright, Vesa Välimäki, Matti Hämäläinen

TL;DR
This paper introduces a neural synthesis-based method for audio time stretching that significantly improves quality at large stretching factors, especially for environmental sounds, by better modeling transient and noise components.
Contribution
It presents a novel combination of sines-transients-noise decomposition with WaveNet synthesis to enhance large-factor audio time stretching quality.
Findings
Outperforms four existing TSM algorithms in subjective tests
Provides better transient and noise modeling for large stretching factors
Stereo compatible and suitable for media slow motion applications
Abstract
A deep neural network solution for time-scale modification (TSM) focused on large stretching factors is proposed, targeting environmental sounds. Traditional TSM artifacts such as transient smearing, loss of presence, and phasiness are heavily accentuated and cause poor audio quality when the TSM factor is four or larger. The weakness of established TSM methods, often based on a phase vocoder structure, lies in the poor description and scaling of the transient and noise components, or nuances, of a sound. Our novel solution combines a sines-transients-noise decomposition with an independent WaveNet synthesizer to provide a better description of the noise component and an improved sound quality for large stretching factors. Results of a subjective listening test against four other TSM algorithms are reported, showing the proposed method to be often superior. The proposed method is stereo…
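The abstract does not spell out how the sines-transients-noise decomposition is computed. As a rough illustration only, the sketch below uses one common approach, median filtering of a magnitude spectrogram along time and along frequency, followed by soft masks. The function names, filter lengths, and thresholds (0.8 and 0.7) are assumptions made for this example and are not taken from the paper.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def median_filter_1d(mag, size, axis):
    """Median-filter a 2-D array along one axis with edge padding (size must be odd)."""
    pad = size // 2
    pad_width = [(0, 0), (0, 0)]
    pad_width[axis] = (pad, pad)
    padded = np.pad(mag, pad_width, mode="edge")
    # The window dimension is appended as the last axis
    windows = sliding_window_view(padded, size, axis=axis)
    return np.median(windows, axis=-1)


def soft_mask(ratio, hi, lo):
    """0 below `lo`, 1 above `hi`, smooth sin^2 ramp in between."""
    ramp = np.clip((ratio - lo) / (hi - lo), 0.0, 1.0)
    return np.sin(0.5 * np.pi * ramp) ** 2


def stn_masks(mag, size_time=9, size_freq=9, hi=0.8, lo=0.7, eps=1e-12):
    """Return (sines, transients, noise) masks for a (freq, time) magnitude array.

    Hypothetical single-stage sketch; all parameter values are assumptions.
    """
    # Smoothing along time keeps horizontal ridges (steady tones)
    tonal = median_filter_1d(mag, size_time, axis=1)
    # Smoothing along frequency keeps vertical ridges (clicks, onsets)
    transient = median_filter_1d(mag, size_freq, axis=0)
    tonalness = tonal / (tonal + transient + eps)
    sines = soft_mask(tonalness, hi, lo)
    transients = soft_mask(1.0 - tonalness, hi, lo)
    # Whatever is neither clearly tonal nor clearly transient is treated as noise
    noise = 1.0 - sines - transients
    return sines, transients, noise


# Synthetic spectrogram: flat background, one steady tone, one click
mag = np.full((32, 40), 0.1)
mag[10, :] = 1.0   # horizontal line: a sustained sinusoid
mag[:, 20] = 1.0   # vertical line: a broadband transient
sines, transients, noise = stn_masks(mag)
```

Each mask would be multiplied with the complex spectrogram to obtain the three components, which can then be stretched by different means. In the method summarized here, the noise component is the one handed to the WaveNet synthesizer.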
Taxonomy
Topics: Hearing Loss and Rehabilitation · Speech and Audio Processing · Music Technology and Sound Studies
Methods: Test · Mixture of Logistic Distributions · Dilated Causal Convolution · WaveNet
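
Dilated causal convolution, listed above, is the building block of WaveNet. The following is a minimal NumPy sketch of the operation and of the receptive field of a stack of such layers. The function names, kernel weights, and dilation pattern are illustrative assumptions, not the configuration used in the paper.

```python
import numpy as np


def dilated_causal_conv1d(x, weights, dilation):
    """Compute y[n] = sum_k weights[k] * x[n - k * dilation], with x[m] = 0 for m < 0."""
    x = np.asarray(x, dtype=float)
    y = np.zeros_like(x)
    for k, w in enumerate(weights):
        shift = k * dilation
        if shift >= len(x):
            break
        # Delay the input by `shift` samples; earlier samples are implicitly zero,
        # so no output sample ever depends on a future input sample
        y[shift:] += w * x[: len(x) - shift]
    return y


def receptive_field(kernel_size, dilations):
    """Number of input samples that can influence one output sample of the stack."""
    return 1 + (kernel_size - 1) * sum(dilations)


# Illustrative values only
impulse = np.zeros(16)
impulse[3] = 1.0
out = dilated_causal_conv1d(impulse, weights=[0.5, 0.25], dilation=4)
rf = receptive_field(kernel_size=2, dilations=[1, 2, 4, 8])
```

With a kernel size of 2 and dilations that double at each layer, the receptive field grows exponentially with depth while the parameter count grows only linearly.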
