# Sound source detection, localization and classification using consecutive ensemble of CRNN models

**Authors:** Sławomir Kapka, Mateusz Lewandowski

arXiv: 1908.00766 · 2019-10-31

## TL;DR

This paper presents an ensemble of four CRNN models, run consecutively, for sound event localization and detection (SELD). The ensemble estimates the number of active sources, their directions of arrival, and their classes, and is evaluated on the TAU Spatial Sound Events 2019 - Ambisonic dataset against other DCASE2019 task3 submissions.

## Contribution

It introduces a consecutive ensemble approach with four CRNN models to improve sound source detection, localization, and classification in a unified framework.

## Key findings

- Achieved accurate localization and classification on the TAU Spatial Sound Events 2019 - Ambisonic dataset
- Outperformed some existing methods in DCASE2019 task3
- Demonstrated the effectiveness of consecutive ensemble models

## Abstract

In this paper, we describe our method for DCASE2019 task3: Sound Event Localization and Detection (SELD). We use four SELDnet-like, single-output CRNN models which run in a consecutive manner to recover all possible information about occurring events. We decompose the SELD task into four subtasks: estimating the number of active sources, estimating the direction of arrival of a single source, estimating the direction of arrival of the second source when the direction of the first one is known, and multi-label classification. We use a custom consecutive ensemble to predict each event's onset, offset, direction of arrival and class. The proposed approach is evaluated on the TAU Spatial Sound Events 2019 - Ambisonic dataset and compared with other participants' submissions.
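The decomposition described in the abstract can be sketched as a frame-wise pipeline. This is a minimal illustration under stated assumptions, not the authors' implementation: the four models are passed in as hypothetical callables (`count_sources`, `doa_single`, `doa_second`, `classify`), the cap at two sources follows the abstract's mention of a first and a second source, and `active_segments` is an assumed helper showing how onsets and offsets could be read off the per-frame source counts.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

DOA = Tuple[float, float]  # (azimuth, elevation) in degrees; an assumed convention


@dataclass
class FrameResult:
    n_active: int       # estimated number of active sources in this frame
    doas: List[DOA]     # one direction of arrival per active source
    labels: List[str]   # multi-label classification output


def run_consecutive(
    frames: Sequence[object],
    count_sources: Callable[[object], int],
    doa_single: Callable[[object], DOA],
    doa_second: Callable[[object, DOA], DOA],
    classify: Callable[[object], List[str]],
) -> List[FrameResult]:
    """Apply the four (hypothetical) models one after another to each frame."""
    results = []
    for frame in frames:
        n = count_sources(frame)                # step 1: number of active sources
        doas: List[DOA] = []
        if n >= 1:
            first = doa_single(frame)           # step 2: DOA of a single source
            doas.append(first)
            if n >= 2:
                # step 3: DOA of the second source, given the first one
                doas.append(doa_second(frame, first))
        labels = classify(frame) if n > 0 else []  # step 4: multi-label classes
        results.append(FrameResult(n, doas, labels))
    return results


def active_segments(counts: Sequence[int]) -> List[Tuple[int, int]]:
    """Return (onset, offset) frame indices of runs with at least one source.

    The offset is exclusive: it is the first frame after the run.
    """
    segments = []
    start = None
    for i, n in enumerate(counts):
        if n > 0 and start is None:
            start = i
        elif n == 0 and start is not None:
            segments.append((start, i))
            start = None
    if start is not None:
        segments.append((start, len(counts)))
    return segments
```

With trained models substituted for the callables, `active_segments([r.n_active for r in results])` would give candidate event boundaries. How the paper actually merges frame-level outputs into events is described in the full text and may differ from this simplification.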

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1908.00766/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/1908.00766/full.md

## References

10 references — full list in the complete paper: https://tomesphere.com/paper/1908.00766/full.md

---
Source: https://tomesphere.com/paper/1908.00766