Angular Distance Distribution Loss for Audio Classification

Antonio Almudévar; Romain Serizel; Alfonso Ortega

arXiv:2411.00153 · cs.SD · November 4, 2024

Open Access

TL;DR

This paper introduces the Angular Distance Distribution (ADD) Loss, a novel loss function that improves embedding properties by considering intra-class, inter-class, and hierarchical relationships, leading to better audio classification performance.

Contribution

The paper proposes the ADD Loss, which jointly optimizes intra-class, inter-class, and hierarchical properties of embeddings using angular distance statistics.

Findings

01. ADD Loss enhances embedding properties in audio classification.

02. ADD Loss outperforms existing loss functions in the paper's experiments.

03. ADD Loss improves the intra-class, inter-class, and hierarchical properties of the embeddings.

Abstract

Classification is a pivotal task in deep learning not only because of its intrinsic importance, but also for providing embeddings with desirable properties in other tasks. To optimize these properties, a wide variety of loss functions have been proposed that attempt to minimize the intra-class distance and maximize the inter-class distance in the embedding space. In this paper we argue that, in addition to these two, eliminating hierarchies within and among classes are two other desirable properties for classification embeddings. Furthermore, we propose the Angular Distance Distribution (ADD) Loss, which aims to enhance the four previous properties jointly. For this purpose, it imposes conditions on the first and second order statistical moments of the angular distance between embeddings. Finally, we perform experiments showing that our loss function improves all four properties and,…
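
The abstract describes the loss only in terms of the first and second order moments of the angular distance between embeddings. As a rough illustration of those quantities (not the authors' loss, whose exact conditions are not stated on this page), the sketch below computes the mean and variance of pairwise angular distances, split into same-class and different-class pairs. The function name `angular_distance_stats` and the intra/inter split are assumptions made for this example.

```python
import numpy as np

def angular_distance_stats(embeddings, labels):
    """Mean and variance of pairwise angular distances (hypothetical helper).

    Returns statistics separately for same-class (intra) and
    different-class (inter) pairs. This is NOT the ADD Loss itself,
    only the moments the abstract says the loss constrains.
    """
    x = np.asarray(embeddings, dtype=float)
    y = np.asarray(labels)
    # Project onto the unit sphere so only direction matters.
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    # Cosine similarity, clipped to guard arccos against rounding error.
    cos = np.clip(x @ x.T, -1.0, 1.0)
    theta = np.arccos(cos)  # angles in [0, pi]
    # Use each unordered pair once (strict upper triangle).
    i, j = np.triu_indices(len(y), k=1)
    same = y[i] == y[j]
    intra, inter = theta[i, j][same], theta[i, j][~same]
    return {
        "intra_mean": intra.mean(), "intra_var": intra.var(),
        "inter_mean": inter.mean(), "inter_var": inter.var(),
    }
```

A loss in this family would presumably constrain some combination of these four numbers; which conditions the paper actually imposes has to be taken from the full text. The helper also assumes the batch contains at least one same-class pair and one different-class pair, otherwise the corresponding statistics are undefined.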

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis