# All-neural online source separation, counting, and diarization for meeting analysis

**Authors:** Thilo von Neumann, Keisuke Kinoshita, Marc Delcroix, Shoko Araki, Tomohiro Nakatani, Reinhold Haeb-Umbach

arXiv: 1902.07881 · 2019-02-22

## TL;DR

This paper introduces an all-neural, block-online approach to simultaneous speaker counting, diarization, and source separation for meeting analysis. In simulation experiments it reaches state-of-the-art separation performance and generalizes to an unseen, large number of blocks.

## Contribution

It presents the first all-neural, block-online system that jointly performs speaker counting, diarization, and separation. The system keeps a stable output order for the separated sources, even when a speaker stays silent for several time blocks.

## Key findings

- Achieves state-of-the-art separation performance in simulation experiments.
- Delivers good diarization and source counting results at the same time.
- Generalizes well to an unseen, large number of blocks.

## Abstract

Automatic meeting analysis comprises the tasks of speaker counting, speaker diarization, and the separation of overlapped speech, followed by automatic speech recognition. This all has to be carried out on arbitrarily long sessions and, ideally, in an online or block-online manner. While significant progress has been made on individual tasks, this paper presents for the first time an all-neural approach to simultaneous speaker counting, diarization and source separation. The NN-based estimator operates in a block-online fashion and tracks speakers even if they remain silent for a number of time blocks, thus learning a stable output order for the separated sources. The neural network is recurrent over time as well as over the number of sources. The simulation experiments show that state of the art separation performance is achieved, while at the same time delivering good diarization and source counting results. It even generalizes well to an unseen large number of blocks.
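The double recurrence described in the abstract (over time blocks and over sources) can be pictured as two nested loops. The following is a minimal control-flow sketch only, not the authors' model: the neural estimator is replaced by a hypothetical toy function `toy_extract`, each block is represented by the set of speaker labels active in it, and the names `track`, `identities`, and `residual` are illustrative assumptions. It shows how carrying per-source state across blocks gives a stable output order, a running speaker count, and per-block activity (diarization), even when a speaker is silent for some blocks.

```python
from typing import List, Optional, Set, Tuple


def toy_extract(residual: Set[str], identity: Optional[str]) -> Tuple[Optional[str], bool]:
    """Hypothetical stand-in for one pass of the estimator over one source.

    `residual` is what is still unexplained in the current block; `identity`
    is the state carried over from earlier blocks (None for a fresh slot).
    Returns the identity to carry forward and whether the source is active.
    """
    if identity is None:
        if not residual:
            return None, False
        # Fresh slot: pick any unexplained speaker (deterministic for the toy).
        identity = min(residual)
    return identity, identity in residual


def track(blocks: List[Set[str]]) -> List[List[bool]]:
    """Return one activity row per block; column i always refers to the same speaker."""
    identities: List[str] = []  # per-source state carried across blocks (assumption)
    activity: List[List[bool]] = []
    for block in blocks:  # recurrence over time blocks
        residual = set(block)
        row: List[bool] = []
        # Recurrence over sources: revisit known slots first, in fixed order,
        # so a silent speaker keeps its slot and is marked inactive.
        for identity in identities:
            _, active = toy_extract(residual, identity)
            row.append(active)
            residual.discard(identity)
        # Open new slots until nothing in the block is left unexplained.
        while residual:
            new_identity, active = toy_extract(residual, None)
            identities.append(new_identity)
            row.append(active)
            residual.discard(new_identity)
        activity.append(row)
    return activity
```

For example, `track([{"A"}, {"A", "B"}, set(), {"B"}, {"A", "C"}])` returns `[[True], [True, True], [False, False], [False, True], [True, False, True]]`. The length of each row is the number of speakers counted so far, and speaker A stays in column 0 across the blocks in which it is silent.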

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1902.07881/full.md

## Figures

2 figures with captions in the complete paper: https://tomesphere.com/paper/1902.07881/full.md

## References

25 references — full list in the complete paper: https://tomesphere.com/paper/1902.07881/full.md

---
Source: https://tomesphere.com/paper/1902.07881