Power Pooling Operators and Confidence Learning for Semi-Supervised Sound Event Detection

Yuzhuo Liu; Hangting Chen; Pengyuan Zhang

arXiv:2005.11459 · cs.SD · May 26, 2020 · 1 cites

Open Access

TL;DR

This paper introduces a confidence learning method and a power pooling function for semi-supervised sound event detection, significantly improving accuracy and reducing error rates by leveraging confidence-weighted data and nonlinear pooling.

Contribution

It proposes a novel confidence-based weighting scheme and a trainable power pooling function, enhancing semi-supervised sound event detection performance.
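The idea of weighting data by confidence can be illustrated with a minimal sketch. This summary does not give the paper's exact formulation, so everything here is an assumption: the function name `confidence_weighted_bce`, the use of binary cross-entropy, and the normalization by the sum of confidences are all hypothetical choices made for illustration.

```python
import numpy as np


def confidence_weighted_bce(pseudo_labels, probs, confidence, eps=1e-7):
    """Binary cross-entropy where each sample is scaled by a confidence weight.

    Hypothetical illustration, not the paper's loss: low-confidence samples
    still contribute, only with a smaller weight, instead of being discarded.
    """
    y = np.asarray(pseudo_labels, dtype=float)
    p = np.clip(np.asarray(probs, dtype=float), eps, 1.0 - eps)
    c = np.asarray(confidence, dtype=float)
    # Per-sample binary cross-entropy against the pseudo-labels.
    bce = -(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))
    # Confidence-weighted average; eps guards against all-zero confidence.
    return float((c * bce).sum() / max(c.sum(), eps))


# Illustrative values only: the second prediction disagrees with its pseudo-label.
labels = [1.0, 0.0]
probs = [0.9, 0.8]
uniform = confidence_weighted_bce(labels, probs, [1.0, 1.0])
downweighted = confidence_weighted_bce(labels, probs, [1.0, 0.1])
```

With uniform confidence the result is the ordinary mean loss. Lowering the confidence of the second sample shrinks its influence without removing it, which is the contrast with thresholding on probability that the summary describes.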

Findings

01. Confidence correlates with prediction accuracy.

02. Power pooling outperforms linear pooling.

03. 34% relative error rate reduction achieved.

Abstract

In recent years, the involvement of synthetic strongly labeled data, weakly labeled data and unlabeled data has drawn much research attention in semi-supervised sound event detection (SSED). Self-training models carry out predictions without strong annotations and then take predictions with high probabilities as pseudo-labels for retraining. Such models have shown their effectiveness in SSED. However, probabilities are poorly calibrated confidence estimates, and samples with low probabilities are ignored. Hence, we introduce a method of learning confidence deliberately and retaining all data distinctly by applying confidence as weights. Additionally, linear pooling has been considered a state-of-the-art aggregation function for SSED with weak labeling. In this paper, we propose a power pooling function whose coefficient can be trained automatically to achieve nonlinearity. A…
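The abstract as shown here does not include the pooling formula, so the sketch below assumes one way to write a power pooling function: each frame-level probability is weighted by itself raised to an exponent n. Under that assumption, n = 0 reduces to average pooling, n = 1 reduces to linear pooling, and large n approaches max pooling. The function name `power_pooling`, the fixed `n` argument (the paper trains this coefficient instead), and the sample values are illustrative.

```python
import numpy as np


def power_pooling(frame_probs, n):
    """Aggregate frame-level probabilities of shape (frames, classes) per class.

    Assumed form: y_c = sum_t(p_tc * p_tc**n) / sum_t(p_tc**n).
    Here n is a plain number; in the paper it is a trainable coefficient.
    """
    p = np.asarray(frame_probs, dtype=float)
    weights = np.power(p, n)
    # Guard against division by zero when every frame probability is 0.
    denom = np.maximum(weights.sum(axis=0), 1e-12)
    return (p * weights).sum(axis=0) / denom


# Three frames, one event class (illustrative values only).
probs = np.array([[0.1], [0.2], [0.9]])
average_like = power_pooling(probs, 0)   # equals the mean of the frames
linear_like = power_pooling(probs, 1)    # sum(p^2) / sum(p)
max_like = power_pooling(probs, 50)      # close to the largest frame value
```

Raising n shifts the clip-level estimate from the frame average toward the most active frame, which is the kind of nonlinearity a learned coefficient could tune per task.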

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Animal Vocal Communication and Behavior