# Modelling of Sound Events with Hidden Imbalances Based on Clustering and Separate Sub-Dictionary Learning

**Authors:** Chaitanya Narisetty, Tatsuya Komatsu, Reishi Kondo

arXiv: 1904.02852 · 2019-04-08

## TL;DR

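This paper models each sound event as a small set of latent factors found by clustering, and learns a separate NMF sub-dictionary for each factor so that hidden imbalances in data size do not dominate training. On the DCASE 2013 polyphonic sound event detection task, the approach reaches a 46.5% F-measure.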

## Contribution

The paper presents a method that clusters each event's spectra into latent factors and applies non-negative matrix factorization (NMF) to each factor separately. Giving every factor its own sub-dictionary reduces over-fitting when some factors have little training data.

## Key findings

- Achieves a 46.5% detection F-measure on the DCASE 2013 polyphonic sound event detection task.
- Improves on the existing state-of-the-art methods by more than 19%.
- Separate sub-dictionaries model acoustic elements effectively even when data sizes are limited.
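For readers unfamiliar with the metric, the following is a minimal sketch of a frame-based F-measure for polyphonic detection. This summary does not state which F-measure variant the paper reports, so the frame-based definition, the function name `frame_f_measure`, and the toy activity matrices are all assumptions made for illustration.

```python
import numpy as np

def frame_f_measure(reference, estimate):
    """Frame-based F-measure over boolean (events x frames) activity matrices.

    Assumption: every (event, frame) cell counts as one binary decision.
    """
    ref = np.asarray(reference, dtype=bool)
    est = np.asarray(estimate, dtype=bool)
    tp = np.sum(ref & est)    # active in both
    fp = np.sum(~ref & est)   # detected but not present
    fn = np.sum(ref & ~est)   # present but missed
    precision = tp / (tp + fp) if (tp + fp) > 0 else 0.0
    recall = tp / (tp + fn) if (tp + fn) > 0 else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy example: two overlapping events across four frames (hypothetical data).
reference = [[1, 1, 0, 0],
             [0, 1, 1, 0]]
estimate = [[1, 0, 0, 0],
            [0, 1, 1, 1]]
score = frame_f_measure(reference, estimate)
```

In this toy case there are three true positives, one false positive, and one false negative, so precision and recall are both 0.75.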

## Abstract

This paper proposes an effective modelling of sound event spectra with a hidden data-size imbalance, for improved Acoustic Event Detection (AED). The proposed method models each event as an aggregated representation of a few latent factors, while conventional approaches try to find acoustic elements directly from the event spectra. In the method, all the latent factors across all events are assigned comparable importance and complexity to overcome the hidden imbalance of data sizes in event spectra. To extract latent factors in each event, the proposed method employs clustering, performs non-negative matrix factorization on each latent factor, and learns its acoustic elements as a sub-dictionary. Separate sub-dictionary learning effectively models the acoustic elements with limited data sizes and avoids over-fitting due to hidden imbalances in training data. For the task of polyphonic sound event detection from the DCASE 2013 challenge, an AED system based on the proposed modelling achieves a detection F-measure of 46.5%, a significant improvement of more than 19% compared to the existing state-of-the-art methods.
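The pipeline described in the abstract (cluster an event's spectra, factorize each cluster separately, keep each result as a sub-dictionary) might be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not name the clustering algorithm or the NMF cost function, so plain k-means and Euclidean multiplicative updates are swapped in here, and the function names, cluster count, and atoms-per-cluster values are hypothetical.

```python
import numpy as np

def kmeans(X, k, n_iter=50, seed=0):
    """Plain k-means over the columns of X (freq_bins x frames); returns one label per frame."""
    rng = np.random.default_rng(seed)
    centers = X[:, rng.choice(X.shape[1], size=k, replace=False)].copy()
    labels = np.zeros(X.shape[1], dtype=int)
    for _ in range(n_iter):
        # Squared distance from every center to every frame: shape (k, frames).
        dist = ((X[:, None, :] - centers[:, :, None]) ** 2).sum(axis=0)
        labels = dist.argmin(axis=0)
        for j in range(k):
            if np.any(labels == j):
                centers[:, j] = X[:, labels == j].mean(axis=1)
    return labels

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """NMF with Euclidean multiplicative updates: V ~= W @ H, all entries non-negative."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def learn_event_dictionary(V, n_clusters=2, atoms_per_cluster=4):
    """Cluster one event's spectra, then learn a separate sub-dictionary per cluster.

    Assumption: every cluster gets the same number of atoms regardless of how many
    frames it holds, so rare latent factors are not crowded out by frequent ones.
    """
    labels = kmeans(V, n_clusters)
    sub_dictionaries = []
    for j in range(n_clusters):
        V_j = V[:, labels == j]
        if V_j.shape[1] == 0:
            continue  # skip a cluster that ended up empty
        W_j, _ = nmf(V_j, atoms_per_cluster)
        sub_dictionaries.append(W_j)
    # The event dictionary is the concatenation of its sub-dictionaries.
    return np.hstack(sub_dictionaries)

# Hypothetical imbalanced event: 200 frames of one spectral pattern, 10 of another.
rng = np.random.default_rng(1)
common = np.outer(np.linspace(1.0, 0.1, 16), np.ones(200)) + 0.05 * rng.random((16, 200))
rare = np.outer(np.linspace(0.1, 1.0, 16), np.ones(10)) + 0.05 * rng.random((16, 10))
V = np.hstack([common, rare])
D = learn_event_dictionary(V, n_clusters=2, atoms_per_cluster=4)
```

Running NMF once on all of `V` would let the 200 common frames dominate the reconstruction cost. Factorizing each cluster on its own gives the rare pattern a dedicated set of atoms, which is the intuition behind assigning latent factors comparable importance and complexity.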

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/1904.02852/full.md

## Figures

6 figures with captions in the complete paper: https://tomesphere.com/paper/1904.02852/full.md

## References

21 references — full list in the complete paper: https://tomesphere.com/paper/1904.02852/full.md

---
Source: https://tomesphere.com/paper/1904.02852