# Towards joint sound scene and polyphonic sound event recognition

**Authors:** Helen L. Bear, Ines Nolasco, Emmanouil Benetos

arXiv: 1904.10408 · 2019-07-02

## TL;DR

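This paper introduces a new dataset labelled with both sound scenes and sound events, together with a joint method for classifying scenes and recognizing events. The joint approach makes learning more efficient, and sound event detection results remain robust even though the dataset's sample distribution is skewed towards sound scenes.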

## Contribution

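The paper presents a novel method for jointly classifying sound scenes and recognizing polyphonic sound events, along with a new dataset that carries both scene and event labels. It brings together Acoustic Scene Classification (ASC) and Sound Event Detection (SED), which are otherwise treated as two separate tasks.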
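To make the idea of a joint objective concrete, the sketch below combines a single-label scene loss (softmax cross-entropy) with a multi-label event loss (per-class sigmoid binary cross-entropy, so several events can be active at once). This is only an illustration of a generic multi-task loss, not the method from the paper, whose details are omitted from this summary view. The function names and the `event_weight` parameter are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def scene_loss(scene_logits, scene_index):
    """Cross-entropy for exactly one scene label per clip."""
    return -math.log(softmax(scene_logits)[scene_index])


def event_loss(event_logits, event_targets):
    """Mean binary cross-entropy; each event class is scored independently,
    which allows overlapping (polyphonic) events."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, t in zip(event_logits, event_targets):
        p = sigmoid(z)
        total -= t * math.log(p + eps) + (1 - t) * math.log(1 - p + eps)
    return total / len(event_logits)


def joint_loss(scene_logits, scene_index, event_logits, event_targets,
               event_weight=1.0):
    """Assumed combination: scene loss plus a weighted event loss.
    event_weight is a hypothetical knob, not a value from the paper."""
    return (scene_loss(scene_logits, scene_index)
            + event_weight * event_loss(event_logits, event_targets))


# Example: 3 candidate scenes, 4 event classes, two events active together.
loss = joint_loss([2.0, 0.1, -1.0], 0, [3.0, -2.0, 1.5, -4.0], [1, 0, 1, 0])
```

In a dataset skewed towards scene labels, a weight such as `event_weight` is one possible way to rebalance the two terms; whether the authors do anything similar is not stated here.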

## Key findings

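- Taking a joint approach to scene and event recognition makes learning more efficient.
- SED results are robust even though the dataset's sample distribution is skewed towards sound scenes, although sound event detection still needs improvement.
- The new dataset carries both sound scene and sound event labels, enabling combined scene and event analysis.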

## Abstract

Acoustic Scene Classification (ASC) and Sound Event Detection (SED) are two separate tasks in the field of computational sound scene analysis. In this work, we present a new dataset with both sound scene and sound event labels and use this to demonstrate a novel method for jointly classifying sound scenes and recognizing sound events. We show that by taking a joint approach, learning is more efficient and, whilst improvements are still needed for sound event detection, SED results are robust in a dataset where the sample distribution is skewed towards sound scenes.

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.10408/full.md

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/1904.10408/full.md

## References

22 references — full list in the complete paper: https://tomesphere.com/paper/1904.10408/full.md

---
Source: https://tomesphere.com/paper/1904.10408