Class-Incremental Learning for Sound Event Localization and Detection

Ruchi Pandey; Manjunath Mulimani; Archontis Politis; Annamaria Mesaros

arXiv:2411.12830 · eess.AS · November 21, 2024 · ICASSP

TL;DR

This paper explores class-incremental learning for sound event localization and detection (SELD). It proposes a method that learns new sound classes while retaining knowledge of previously learned ones, and evaluates it on a realistic spatial sound dataset.

Contribution

The paper introduces an incremental learning approach for SELD that uses a mean square error-based distillation loss, allowing new classes to be learned without forgetting old ones.

Findings

01 Maintains baseline performance across all metrics when evaluated on the full set of learned classes.

02 Learns new sound classes without significant performance degradation.

03 Validated on the TAU-NIGENS Spatial Sound Events 2021 dataset.

Abstract

This paper investigates the feasibility of class-incremental learning (CIL) for Sound Event Localization and Detection (SELD) tasks. The method features an incremental learner that can learn new sound classes independently while preserving knowledge of old classes. Continual learning is achieved through a mean square error-based distillation loss that minimizes output discrepancies between subsequent learners. The experiments are conducted on the TAU-NIGENS Spatial Sound Events 2021 dataset, which includes 12 different sound classes, and demonstrate the efficacy of the proposed method. We begin by learning 8 classes and introduce the 4 new classes in the next stage. After the incremental phase, the system is evaluated on the full set of learned classes. Results show that, for this realistic dataset, our proposed method successfully maintains baseline performance across all metrics.
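As a rough illustration of the distillation idea described in the abstract, the sketch below computes a mean square error between the outputs of a previous learner (8 classes) and the matching outputs of the current learner (12 classes). It is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the flat per-class output vectors, the MSE task term on the new classes, and the `weight` parameter are all hypothetical choices made for illustration.

```python
from typing import Sequence

N_OLD = 8  # classes learned in the initial stage (from the abstract)
N_NEW = 4  # classes introduced in the incremental stage (from the abstract)


def mse(a: Sequence[float], b: Sequence[float]) -> float:
    """Mean squared error between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def distillation_loss(old_out: Sequence[float], new_out: Sequence[float]) -> float:
    """MSE between the previous learner's outputs and the current learner's
    outputs for the same (old) classes. Assumes the old classes occupy the
    first len(old_out) positions of new_out (a hypothetical layout)."""
    return mse(old_out, new_out[: len(old_out)])


def incremental_loss(
    old_out: Sequence[float],
    new_out: Sequence[float],
    new_targets: Sequence[float],
    weight: float = 1.0,  # hypothetical balancing factor, not from the paper
) -> float:
    """Task loss on the new classes plus weighted distillation on the old ones.
    Using MSE for the task term is an assumption made for this sketch."""
    task = mse(new_out[len(old_out):], new_targets)
    return task + weight * distillation_loss(old_out, new_out)


# Toy example: the current learner drifts on one old class only.
old_outputs = [0.0] * N_OLD
new_outputs = [1.0] + [0.0] * (N_OLD - 1) + [1.0, 0.0, 0.0, 0.0]
targets_for_new_classes = [1.0, 0.0, 0.0, 0.0]

drift_penalty = distillation_loss(old_outputs, new_outputs)
total = incremental_loss(old_outputs, new_outputs, targets_for_new_classes)
```

In this toy case the new classes are predicted perfectly, so the whole loss comes from the distillation term, which penalizes the current learner for moving away from the previous learner's output on an old class.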

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis

Methods: Sparse Evolutionary Training