Onssen: an open-source speech separation and enhancement library

Zhaoheng Ni; Michael I Mandel

arXiv:1911.00982·eess.AS·November 5, 2019·5 cites

TL;DR

Onssen is an open-source library that facilitates speech separation and enhancement research by providing implementations of various deep learning algorithms, supporting customized datasets, and enabling easy comparison of methods.

Contribution

It introduces a comprehensive, open-source platform for speech separation that supports multiple algorithms and datasets, promoting reproducibility and benchmarking in the field.

Findings

01 Algorithms implemented in onssen achieve their reported performance

02 onssen supports most time-frequency mask-based methods

03 onssen enables easy comparison across algorithms and datasets

Abstract

Speech separation is an essential task for multi-talker speech recognition. Recently, many deep learning approaches have been proposed, and they have repeatedly advanced the state of the art. The lack of available algorithm implementations restricts researchers to using the same dataset for comparison. A generic platform can benefit researchers by making it easy to implement novel separation algorithms and compare them with existing ones on customized datasets. We introduce "onssen": an open-source speech separation and enhancement library. onssen is a library mainly for deep learning separation and enhancement algorithms. It uses the LibRosa and NumPy libraries for feature extraction and PyTorch as the back-end for model training. onssen supports most of the time-frequency mask-based separation algorithms (e.g. deep clustering, chimera net, chimera++, and so on) and also supports…
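To make the phrase "time-frequency mask-based separation" concrete, here is a minimal sketch using only NumPy. It is not onssen's API: the `stft` helper, the synthetic sine-wave sources, and all parameter values are assumptions chosen for illustration. It also uses an oracle (ideal binary) mask computed from the known sources, whereas a mask-based separation system would predict the mask from the mixture with a trained network.

```python
import numpy as np


def stft(signal, frame_len=256, hop=128):
    """Hypothetical helper: Hann-windowed STFT, shape (frames, bins)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.fft.rfft(frames, axis=1)


# Two synthetic "speakers" (assumed for illustration): tones at different pitches.
sr = 8000
t = np.arange(sr) / sr
source_a = np.sin(2 * np.pi * 440 * t)
source_b = 0.8 * np.sin(2 * np.pi * 1800 * t)
mixture = source_a + source_b

spec_a, spec_b, spec_mix = stft(source_a), stft(source_b), stft(mixture)

# Ideal binary mask: 1 in each time-frequency bin where source A dominates.
mask_a = (np.abs(spec_a) > np.abs(spec_b)).astype(float)

# Apply the mask and its complement to the mixture spectrogram.
est_a = mask_a * spec_mix
est_b = (1.0 - mask_a) * spec_mix

# Compare magnitude error against source A with and without masking.
err_masked = np.linalg.norm(np.abs(est_a) - np.abs(spec_a))
err_unmasked = np.linalg.norm(np.abs(spec_mix) - np.abs(spec_a))
print(f"error with mask: {err_masked:.3f}, without mask: {err_unmasked:.3f}")
```

The two masks are complementary, so the estimates sum back to the mixture spectrogram. Recovering waveforms would additionally require an inverse STFT, which this sketch omits.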

Code & Models

Repositories

speechLabBcCuny/onssen (PyTorch, official)

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Speech Recognition and Synthesis