Label Tree Embeddings for Acoustic Scene Classification
Huy Phan, Lars Hertel, Marco Maass, Philipp Koch, Alfred Mertins

TL;DR
This paper introduces a label tree embedding method for acoustic scene classification. It exploits the structure of the class labels to build a low-dimensional feature representation and achieves state-of-the-art results on the DCASE 2013 and LITIS Rouen datasets.
Contribution
It proposes learning a category taxonomy automatically from the class labels and embedding each acoustic scene instance as its likelihoods of belonging to the taxonomy's meta-classes.
Findings
Achieved state-of-the-art results on the DCASE 2013 dataset.
Achieved state-of-the-art results on the LITIS Rouen dataset.
Demonstrated the effectiveness of label tree embeddings in acoustic scene classification.
Abstract
We present in this paper an efficient approach for acoustic scene classification by exploring the structure of class labels. Given a set of class labels, a category taxonomy is automatically learned by collectively optimizing a clustering of the labels into multiple meta-classes in a tree structure. An acoustic scene instance is then embedded into a low-dimensional feature representation which consists of the likelihoods that it belongs to the meta-classes. We demonstrate state-of-the-art results on two different datasets for the acoustic scene classification task, including the DCASE 2013 and LITIS Rouen datasets.
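The two steps in the abstract (grouping labels into a tree of meta-classes, then describing an instance by its meta-class likelihoods) can be illustrated with a minimal sketch. This is not the authors' algorithm: the collective optimization of the label clustering is replaced here by a simple farthest-pair split on per-class centroids, and the meta-class likelihoods come from a distance-based sigmoid rather than trained classifiers. The toy centroids and all function names (`split_labels`, `build_tree`, `embed`) are hypothetical.

```python
import math

def mean(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def sqdist(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def split_labels(labels, centroids):
    """Stand-in for the paper's clustering step: seed two meta-classes with the
    two most distant class centroids, then assign each label to the nearer seed."""
    pairs = [(a, b) for i, a in enumerate(labels) for b in labels[i + 1:]]
    seed_l, seed_r = max(pairs, key=lambda p: sqdist(centroids[p[0]], centroids[p[1]]))
    left = [c for c in labels
            if sqdist(centroids[c], centroids[seed_l]) <= sqdist(centroids[c], centroids[seed_r])]
    right = [c for c in labels if c not in left]
    if not right:  # degenerate case: identical centroids
        right = [left.pop()]
    return left, right

def build_tree(labels, centroids):
    """Return one (left, right) meta-class pair per split node, in pre-order."""
    if len(labels) < 2:
        return []
    left, right = split_labels(labels, centroids)
    return [(left, right)] + build_tree(left, centroids) + build_tree(right, centroids)

def embed(x, splits, centroids):
    """Embed x as the concatenated (left, right) likelihoods of every split node."""
    features = []
    for left, right in splits:
        d_l = sqdist(x, mean([centroids[c] for c in left]))
        d_r = sqdist(x, mean([centroids[c] for c in right]))
        z = max(-50.0, min(50.0, d_r - d_l))  # clamp to keep exp() well-behaved
        p_l = 1.0 / (1.0 + math.exp(-z))      # assumed likelihood model, not the paper's
        features += [p_l, 1.0 - p_l]
    return features

# Toy per-class centroids in a 2-D feature space (made-up values).
centroids = {"bus": [0.0, 0.0], "tram": [0.0, 1.0],
             "park": [5.0, 5.0], "forest": [5.0, 6.0]}
splits = build_tree(list(centroids), centroids)
embedding = embed([0.2, 0.1], splits, centroids)
print(splits)
print([round(v, 3) for v in embedding])
```

With C class labels, a binary tree of this kind has C - 1 split nodes, so the sketch produces a 2(C - 1)-dimensional embedding regardless of the dimensionality of the input features. A final classifier would then be trained on these embeddings instead of the raw features.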
