Autoencoding Neural Networks as Musical Audio Synthesizers
Joseph Colonel, Christopher Curro, Sam Keene

TL;DR
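This paper introduces a lightweight autoencoding neural network for musical audio synthesis. The network compresses and reconstructs magnitude short-time Fourier transform frames, and audio is recovered by estimating phase with real-time phase gradient heap integration and taking an inverse short-time Fourier transform.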
Contribution
The paper presents an autoencoder-based method for musical audio synthesis that is lighter-weight than current state-of-the-art audio-producing machine learning models, accompanied by an open-source Python implementation.
Findings
Achieves real-time audio synthesis with low computational cost
Provides quantitative metrics demonstrating synthesis quality
Offers an open-source Python implementation
Abstract
A method for musical audio synthesis using autoencoding neural networks is proposed. The autoencoder is trained to compress and reconstruct magnitude short-time Fourier transform frames. The autoencoder produces a spectrogram by activating its smallest hidden layer, and a phase response is calculated using real-time phase gradient heap integration. Taking an inverse short-time Fourier transform produces the audio signal. Our algorithm is light-weight when compared to current state-of-the-art audio-producing machine learning algorithms. We outline our design process, produce metrics, and detail an open-source Python implementation of our model.
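The synthesis path described in the abstract (activate the smallest hidden layer, decode to magnitude frames, attach a phase, invert the transform) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the frame length, hop size, bottleneck width, single-layer encoder and decoder, and random untrained weights are all assumptions, and every function name here is hypothetical. In particular, the phase step uses a crude linear phase advance per bin as a placeholder; it is not real-time phase gradient heap integration.

```python
import numpy as np

# All sizes below are illustrative assumptions, not values from the paper.
N_FFT = 1024              # assumed STFT frame length
HOP = 256                 # assumed hop size between frames
N_BINS = N_FFT // 2 + 1   # magnitude bins per frame (one-sided spectrum)
LATENT = 8                # assumed width of the smallest hidden layer

rng = np.random.default_rng(0)

# Untrained stand-in weights; a real model would learn these from audio.
W_enc = rng.normal(0.0, 0.05, (N_BINS, LATENT))
W_dec = rng.normal(0.0, 0.05, (LATENT, N_BINS))


def relu(x):
    return np.maximum(x, 0.0)


def encode(mag_frames):
    """Compress magnitude frames (n_frames, N_BINS) to latents (n_frames, LATENT)."""
    return relu(mag_frames @ W_enc)


def decode(latents):
    """Expand latents back to non-negative magnitude frames."""
    return relu(latents @ W_dec)


def placeholder_phase(n_frames):
    """Linear phase advance per bin: a stand-in, NOT phase gradient heap integration."""
    frame_starts = np.arange(n_frames) * HOP
    bins = np.arange(N_BINS)
    return 2.0 * np.pi * np.outer(frame_starts, bins) / N_FFT


def overlap_add(spec):
    """Inverse-transform each complex frame and overlap-add (no gain normalization)."""
    window = np.hanning(N_FFT)
    out = np.zeros(HOP * (spec.shape[0] - 1) + N_FFT)
    for i, frame in enumerate(spec):
        start = i * HOP
        out[start:start + N_FFT] += window * np.fft.irfft(frame, n=N_FFT)
    return out


def synthesize(latents):
    """Latent activations -> magnitude frames -> complex spectrogram -> audio."""
    mag = decode(latents)
    phase = placeholder_phase(mag.shape[0])
    return overlap_add(mag * np.exp(1j * phase))


# Hold one latent setting for 20 frames, as a player holding a "patch" might.
latents = np.tile(rng.uniform(0.0, 1.0, LATENT), (20, 1))
audio = synthesize(latents)
```

In this sketch the latent vector plays the role of the synthesizer's controls: changing its entries changes the decoded spectrum, while the phase and inversion steps stay the same.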
Taxonomy
Topics: Music and Audio Processing · Music Technology and Sound Studies · Neuroscience and Music Perception
