Acoustic Scene Clustering Using Joint Optimization of Deep Embedding Learning and Clustering Iteration

Yanxiong Li; Mingle Liu; Wucheng Wang; Yuhan Zhang; Qianhua He

arXiv:2306.05621 · eess.AS · June 12, 2023 · 1 citation

TL;DR

This paper introduces an acoustic scene clustering method that jointly optimizes deep feature learning and clustering, and reports better clustering results than existing unsupervised approaches.

Contribution

It proposes a unified framework that combines deep embedding extraction from a convolutional neural network (CNN) with agglomerative hierarchical clustering (AHC), optimized through a single unified loss function.

Findings

01. Outperforms other unsupervised methods in clustering accuracy

02. Deep embedding features outperform state-of-the-art features

03. Unified optimization improves clustering performance

Abstract

Recent efforts have been made on acoustic scene classification in the audio signal processing community. In contrast, few studies have been conducted on acoustic scene clustering, which is a newly emerging problem. Acoustic scene clustering aims at merging the audio recordings of the same class of acoustic scene into a single cluster without using prior information and training classifiers. In this study, we propose a method for acoustic scene clustering that jointly optimizes the procedures of feature learning and clustering iteration. In the proposed method, the learned feature is a deep embedding that is extracted from a deep convolutional neural network (CNN), while the clustering algorithm is the agglomerative hierarchical clustering (AHC). We formulate a unified loss function for integrating and optimizing these two procedures. Various features and methods are compared. The…
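The clustering half of the pipeline described in the abstract can be illustrated with a minimal sketch of agglomerative hierarchical clustering over embedding vectors. This is not the authors' implementation: the CNN and the unified loss are omitted, the embeddings are made-up 2-D toy points, and the average-linkage criterion, the fixed `num_clusters` stopping rule, and all function names are assumptions chosen for illustration.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(c1, c2, embeddings):
    """Mean pairwise distance between the members of two clusters.

    Average linkage is an assumption; the abstract does not name the criterion.
    """
    total = sum(euclidean(embeddings[i], embeddings[j]) for i in c1 for j in c2)
    return total / (len(c1) * len(c2))


def agglomerative_clustering(embeddings, num_clusters):
    """Start from singletons and merge the closest pair until num_clusters remain."""
    clusters = [[i] for i in range(len(embeddings))]
    while len(clusters) > num_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: average_linkage(clusters[p[0]], clusters[p[1]], embeddings),
        )
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return [sorted(c) for c in clusters]


# Hypothetical toy "embeddings": three recordings near the origin, two far away.
toy_embeddings = [
    [0.0, 0.1], [0.1, 0.0], [0.05, 0.05],
    [5.0, 5.1], [5.1, 5.0],
]
result = agglomerative_clustering(toy_embeddings, num_clusters=2)
print(sorted(result))
```

In the method the abstract describes, the embeddings would come from the CNN and would be updated alongside the clustering rather than held fixed as they are here.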

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Diverse Musicological Studies