Angular Distance Distribution Loss for Audio Classification
Antonio Almudévar, Romain Serizel, Alfonso Ortega

TL;DR
This paper introduces the Angular Distance Distribution (ADD) Loss, a novel loss function that improves embedding properties by considering intra-class, inter-class, and hierarchical relationships, leading to better audio classification performance.
Contribution
The paper proposes the ADD Loss, which jointly optimizes intra-class, inter-class, and hierarchical properties of embeddings using angular distance statistics.
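To make "angular distance statistics" concrete, here is a minimal NumPy sketch that computes the first and second order moments (mean and variance) of pairwise angular distances, split into same-class and different-class pairs. It only illustrates the quantities the abstract refers to and is not the authors' loss. The function name `angular_distance_stats` and the toy data are assumptions made for illustration, and the specific conditions ADD Loss imposes on these moments are defined in the paper and not reproduced here.

```python
import numpy as np

def angular_distance_stats(embeddings, labels):
    """Mean and variance of pairwise angular distances (radians),
    reported separately for same-class and different-class pairs.
    Hypothetical helper for illustration; not the ADD Loss itself."""
    # Normalize rows to unit length so dot products equal cosines.
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    # Clip to guard against values slightly outside [-1, 1] from rounding.
    cos = np.clip(unit @ unit.T, -1.0, 1.0)
    theta = np.arccos(cos)  # angles in [0, pi]

    # Strict upper triangle: each unordered pair is counted once.
    i, j = np.triu_indices(len(labels), k=1)
    same = labels[i] == labels[j]
    dist = theta[i, j]
    intra, inter = dist[same], dist[~same]
    return {
        "intra_mean": intra.mean(), "intra_var": intra.var(),
        "inter_mean": inter.mean(), "inter_var": inter.var(),
    }

# Toy data (assumed): two classes lying on orthogonal axes.
emb = np.array([[1.0, 0.0], [2.0, 0.0], [0.0, 1.0], [0.0, 3.0]])
lab = np.array([0, 0, 1, 1])
stats = angular_distance_stats(emb, lab)
```

In this toy case, same-class embeddings point in the same direction and different-class embeddings are orthogonal, so the intra-class mean angle is 0 and the inter-class mean angle is π/2, both with zero variance.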
Findings
ADD Loss enhances embedding properties in audio classification.
It outperforms existing loss functions in experiments.
It improves intra-class compactness and inter-class separation, and reduces hierarchies within and among classes.
Abstract
Classification is a pivotal task in deep learning, not only because of its intrinsic importance but also because it provides embeddings with desirable properties for other tasks. To optimize these properties, a wide variety of loss functions have been proposed that attempt to minimize the intra-class distance and maximize the inter-class distance in the embedding space. In this paper we argue that, in addition to these two, eliminating hierarchies within and among classes are two other desirable properties for classification embeddings. Furthermore, we propose the Angular Distance Distribution (ADD) Loss, which aims to enhance the four previous properties jointly. For this purpose, it imposes conditions on the first and second order statistical moments of the angular distance between embeddings. Finally, we perform experiments showing that our loss function improves all four properties and,…
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis
