auDeep: Unsupervised Learning of Representations from Audio with Deep Recurrent Neural Networks
Michael Freitag, Shahin Amiriparian, Sergey Pugachevskiy, Nicholas Cummins, Björn Schuller

TL;DR
auDeep is a Python toolkit that uses deep recurrent sequence-to-sequence autoencoders for unsupervised learning of audio representations. The learned features take temporal dynamics into account and are competitive with state-of-the-art audio classification methods.
Contribution
The paper presents a toolkit that uses recurrent sequence-to-sequence autoencoders for unsupervised audio feature learning. It ships with a command line interface and a Python API, both comprehensively documented.
Findings
auDeep features are competitive with state-of-the-art audio classification methods.
The toolkit effectively captures temporal dynamics in acoustic data.
Experimental results validate the approach's effectiveness.
Abstract
auDeep is a Python toolkit for deep unsupervised representation learning from acoustic data. It is based on a recurrent sequence-to-sequence autoencoder approach, which can learn representations of time series data by taking into account their temporal dynamics. We provide an extensive command line interface in addition to a Python API for users and developers, both of which are comprehensively documented and publicly available at https://github.com/auDeep/auDeep. Experimental results indicate that auDeep features are competitive with state-of-the-art audio classification.
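To make the idea of a recurrent sequence-to-sequence autoencoder more concrete, here is a minimal sketch in pure Python. It is not auDeep's implementation or API: the class `SequenceAutoencoder`, the helper functions, the plain tanh recurrent cell, and the toy input are all assumptions chosen for illustration, and the weights are random rather than trained. It only shows the data flow: an encoder reads a sequence of feature frames, its final hidden state serves as a fixed-length representation, and a decoder tries to reconstruct the sequence from that state.

```python
import math
import random


def make_matrix(rows, cols, rng, scale=0.1):
    """Small random weight matrix (illustrative initialisation only)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Matrix-vector product on plain Python lists."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def rnn_step(w_in, w_rec, x, h):
    """One step of a plain tanh recurrent cell: h' = tanh(W_in x + W_rec h)."""
    return [math.tanh(a + b) for a, b in zip(matvec(w_in, x), matvec(w_rec, h))]


class SequenceAutoencoder:
    """Hypothetical, untrained recurrent sequence autoencoder (forward pass only)."""

    def __init__(self, n_features, n_hidden, seed=0):
        rng = random.Random(seed)
        self.n_features = n_features
        self.n_hidden = n_hidden
        self.enc_in = make_matrix(n_hidden, n_features, rng)
        self.enc_rec = make_matrix(n_hidden, n_hidden, rng)
        self.dec_in = make_matrix(n_hidden, n_features, rng)
        self.dec_rec = make_matrix(n_hidden, n_hidden, rng)
        self.dec_out = make_matrix(n_features, n_hidden, rng)

    def encode(self, frames):
        """Read the frames in order; the final hidden state is the representation."""
        h = [0.0] * self.n_hidden
        for x in frames:
            h = rnn_step(self.enc_in, self.enc_rec, x, h)
        return h

    def decode(self, representation, n_frames):
        """Unroll the decoder from the representation, feeding back its own output."""
        h = list(representation)
        prev = [0.0] * self.n_features
        outputs = []
        for _ in range(n_frames):
            h = rnn_step(self.dec_in, self.dec_rec, prev, h)
            prev = matvec(self.dec_out, h)
            outputs.append(prev)
        return outputs


def reconstruction_error(frames, outputs):
    """Root mean squared error between the input and reconstructed frames."""
    total = sum((a - b) ** 2 for f, o in zip(frames, outputs) for a, b in zip(f, o))
    return math.sqrt(total / (len(frames) * len(frames[0])))


# Toy input: 10 frames with 4 values each, standing in for spectral features.
frames = [[math.sin(0.3 * t + 0.5 * k) for k in range(4)] for t in range(10)]
model = SequenceAutoencoder(n_features=4, n_hidden=8)
features = model.encode(frames)               # fixed-length vector for the clip
reconstruction = model.decode(features, len(frames))
error = reconstruction_error(frames, reconstruction)
```

In a trained model, the weights would be adjusted to minimise the reconstruction error, and `features` would then be passed to a separate classifier. Because the encoder processes frames in order, the same frames in a different order yield a different representation, which is what "taking temporal dynamics into account" amounts to in this sketch.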
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies
