# Sound Event Detection with Sequentially Labelled Data Based on Connectionist Temporal Classification and Unsupervised Clustering

**Authors:** Yuanbo Hou, Qiuqiang Kong, Shengchen Li, Mark D. Plumbley

arXiv: 1904.12102 · 2019-04-30

## TL;DR

This paper introduces a two-stage sound event detection (SED) system trained on sequentially labelled data (SLD). It combines connectionist temporal classification (CTC) with an unsupervised clustering stage and reaches performance comparable to a system trained on strongly labelled data, without any onset/offset annotations.

## Contribution

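The paper proposes a two-stage SED method that is trained on sequentially labelled data and combines CTC with unsupervised clustering. On 41 sound event classes, it is far better than a state-of-the-art system trained on weakly labelled data and comparable to the previous state-of-the-art system trained on strongly labelled data.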
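As a rough illustration of why CTC suits labels that carry order but no timing, the sketch below applies the standard CTC collapsing rule (merge consecutive repeats, then drop blanks) to a frame-level label path. The `BLANK` symbol, the `ctc_collapse` helper, and the toy frame sequence are assumptions made for this example; they are not taken from the paper, and the paper's clustering stage is not shown.

```python
from itertools import groupby
from typing import List

BLANK = "-"  # hypothetical blank symbol, chosen for this illustration


def ctc_collapse(frames: List[str], blank: str = BLANK) -> List[str]:
    """Apply the standard CTC collapsing rule to a frame-level label path.

    Step 1: merge runs of identical consecutive labels into one label.
    Step 2: remove blank symbols.
    """
    merged = [label for label, _ in groupby(frames)]
    return [label for label in merged if label != blank]


# Toy frame-level path: one label per frame, invented for this example.
frames = ["-", "dog", "dog", "-", "-", "horn", "-", "dog", "dog"]
decoded = ctc_collapse(frames)
print(decoded)
```

Many different frame-level paths collapse to the same event sequence, which is why a CTC loss can be computed from the ordered event list alone, with no onset or offset times.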

## Key findings

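- Achieves performance comparable to the previous state-of-the-art SED system trained on strongly labelled data
- Far outperforms a state-of-the-art SED system trained on weakly labelled data
- Works without any onset/offset time annotations, evaluated on 41 sound event classes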

## Abstract

Sound event detection (SED) methods typically rely on either strongly labelled data or weakly labelled data. As an alternative, sequentially labelled data (SLD) was proposed. In SLD, the events and the order of events in audio clips are known, without knowing the occurrence time of events. This paper proposes a connectionist temporal classification (CTC) based SED system that uses SLD instead of strongly labelled data, with a novel unsupervised clustering stage. Experiments on 41 classes of sound events show that the proposed two-stage method trained on SLD achieves performance comparable to the previous state-of-the-art SED system trained on strongly labelled data, and is far better than another state-of-the-art SED system trained on weakly labelled data, which indicates the effectiveness of the proposed two-stage method trained on SLD without any onset/offset time of sound events.
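The three label types in the abstract differ in how much they record about each clip. The minimal sketch below derives a sequential label and a weak label from a strong annotation. The `(event, onset, offset)` tuple layout, the helper names, the event names, and the choice to order events by onset time are all assumptions made for illustration; the paper's exact labelling convention may differ.

```python
from typing import List, Set, Tuple

# Hypothetical strong annotation format: (event name, onset in s, offset in s).
StrongLabel = Tuple[str, float, float]


def to_sequential(strong: List[StrongLabel]) -> List[str]:
    """Keep the events and their order (by onset here), discard the times."""
    ordered = sorted(strong, key=lambda item: item[1])
    return [event for event, _onset, _offset in ordered]


def to_weak(strong: List[StrongLabel]) -> Set[str]:
    """Keep only which events are present: no order, no times, no counts."""
    return {event for event, _onset, _offset in strong}


# Invented example clip, not data from the paper.
strong = [
    ("dog_bark", 0.5, 1.2),
    ("car_horn", 2.0, 2.6),
    ("dog_bark", 3.1, 3.8),
]

sequential = to_sequential(strong)
weak = to_weak(strong)
print(sequential)
print(sorted(weak))
```

In this toy clip the sequential label still records that the dog bark occurs twice, once before and once after the car horn, whereas the weak label only records that both events are present.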

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.12102/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/1904.12102/full.md

## References

27 references — full list in the complete paper: https://tomesphere.com/paper/1904.12102/full.md

---
Source: https://tomesphere.com/paper/1904.12102