Connectionist Temporal Localization for Sound Event Detection with Sequential Labeling

Yun Wang; Florian Metze

arXiv:1810.09052 · cs.SD · February 20, 2019

TL;DR

This paper introduces connectionist temporal localization (CTL), an adaptation of connectionist temporal classification (CTC) for sound event detection with sequential labeling. CTL avoids the "peak clustering" problem that keeps conventional CTC from localizing long events well.

Contribution

The paper proposes CTL, a new framework that addresses peak clustering in CTC for sequential labeling in sound event detection, enabling better temporal localization.

Findings

01 CTL closes a third of the gap between presence/absence labeling and strong labeling.

02 Evaluation on a subset of Audio Set shows significant localization improvements.

03 CTL makes it easy to combine sequential labeling with other types of labeling.

Abstract

Research on sound event detection (SED) with weak labeling has mostly focused on presence/absence labeling, which provides no temporal information at all about the event occurrences. In this paper, we consider SED with sequential labeling, which specifies the temporal order of the event boundaries. The conventional connectionist temporal classification (CTC) framework, when applied to SED with sequential labeling, does not localize long events well due to a "peak clustering" problem. We adapt the CTC framework and propose connectionist temporal localization (CTL), which successfully solves the problem. Evaluation on a subset of Audio Set shows that CTL closes a third of the gap between presence/absence labeling and strong labeling, demonstrating the usefulness of the extra temporal information in sequential labeling. CTL also makes it easy to combine sequential labeling with…
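To make the idea of sequential labeling concrete, the following is a minimal sketch of how a strong label (events with onset and offset times) might be reduced to a sequential label that keeps only the temporal order of the event boundaries. The function name `to_sequential_labels`, the `_on`/`_off` token naming, the example events, and the tie-breaking rule are illustrative assumptions and are not taken from the paper.

```python
from typing import List, Tuple

# Hypothetical helper: the name and token format are assumptions for illustration.
def to_sequential_labels(events: List[Tuple[str, float, float]]) -> List[str]:
    """Reduce strong labels (name, onset, offset) to an ordered boundary sequence.

    The timestamps are used only for sorting and are then discarded, so the
    result carries the order of the boundaries but not their positions.
    """
    boundaries = []
    for name, onset, offset in events:
        # The middle element is a tie-breaker (assumption): at equal times,
        # onsets are listed before offsets.
        boundaries.append((onset, 0, f"{name}_on"))
        boundaries.append((offset, 1, f"{name}_off"))
    boundaries.sort()
    return [token for _, _, token in boundaries]


# Made-up example: a dog barks from 0.5 s to 3.0 s, a car passes from 2.0 s to 7.5 s.
strong = [("dog", 0.5, 3.0), ("car", 2.0, 7.5)]
sequence = to_sequential_labels(strong)
print(sequence)  # ['dog_on', 'car_on', 'dog_off', 'car_off']
```

A presence/absence label for the same clip would only record that "dog" and "car" occur, while the sequential label additionally records that the car starts before the dog stops.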
