Ontology-aware Learning and Evaluation for Audio Tagging

Haohe Liu; Qiuqiang Kong; Xubo Liu; Xinhao Mei; Wenwu Wang; Mark D. Plumbley

arXiv:2211.12195 · eess.AS · October 10, 2023

Open Access · 1 Repo

TL;DR

This paper introduces ontology-aware evaluation and training methods for audio tagging that incorporate sound class relationships, leading to more accurate and human-aligned performance assessments.

Contribution

It proposes OmAP, an ontology-aware metric, and OBCE, a loss function reweighted by ontology distance, enhancing evaluation robustness and model training for audio tagging.
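To make the idea of a distance-reweighted loss concrete, here is a minimal sketch of a binary cross-entropy whose per-class terms are scaled by a weight derived from ontology distance. The function names `weighted_bce` and `distance_weights`, and the linear weighting rule itself, are assumptions chosen for illustration; the summary above does not give the exact OBCE formula, so this should not be read as the authors' implementation.

```python
import math

def weighted_bce(probs, targets, weights, eps=1e-7):
    """Mean binary cross-entropy with one weight per class term."""
    total = 0.0
    for p, y, w in zip(probs, targets, weights):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -w * (y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(probs)

def distance_weights(distances, max_distance):
    """Hypothetical rule (an assumption, not the paper's formula):
    target classes (distance 0) keep weight 1.0, and non-target classes
    are weighted by distance / max_distance, so classes close to a
    target in the ontology are penalised less."""
    return [1.0 if d == 0 else d / max_distance for d in distances]

# Three classes: one target, one near neighbour, one distant class.
weights = distance_weights([0, 1, 4], max_distance=4)
loss = weighted_bce([0.9, 0.6, 0.1], [1, 0, 0], weights)
```

With this rule, a confident prediction on a class adjacent to the target contributes less to the loss than the same prediction on a distant class, which is the general behaviour an ontology-aware loss aims for.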

Findings

01 OmAP aligns better with human perception than mAP.

02 OBCE improves mAP and OmAP scores.

03 Ontology information enhances audio tagging performance.

Abstract

This study defines a new evaluation metric for audio tagging tasks to overcome the limitation of the conventional mean average precision (mAP) metric, which treats different kinds of sound as independent classes without considering their relations. In addition, because sound labeling is ambiguous, the labels in the training and evaluation sets are not guaranteed to be accurate or exhaustive, which makes robust evaluation with mAP difficult. The proposed metric, ontology-aware mean average precision (OmAP), addresses the weaknesses of mAP by using the AudioSet ontology information during evaluation. Specifically, we reweight the false positive events in the model prediction based on the ontology graph distance to the target classes. The OmAP measure also provides more insight into model performance by evaluating at different levels of coarseness in the ontology graph. We…
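The reweighting step described in the abstract depends on a graph distance between each class and the target classes. The sketch below computes that distance with a multi-source breadth-first search over a toy ontology and then applies a simple cutoff rule. The toy edges, the undirected treatment of the graph, the `radius` parameter, and the zero-or-one weighting are all assumptions made for illustration; they are not taken from the paper or from the real AudioSet ontology.

```python
from collections import deque

# Toy undirected ontology (hypothetical; not the real AudioSet graph).
EDGES = [
    ("Sound", "Animal"), ("Sound", "Music"),
    ("Animal", "Dog"), ("Animal", "Cat"), ("Dog", "Bark"),
]

def build_graph(edges):
    """Build an undirected adjacency map from a list of edges."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def distances_to_targets(graph, targets):
    """Multi-source BFS: hop count from each node to its nearest target."""
    dist = {t: 0 for t in targets}
    queue = deque(targets)
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def false_positive_weights(graph, targets, radius):
    """Hypothetical rule (an assumption): a non-target class within
    `radius` hops of a target gets weight 0.0, so a false positive there
    is ignored; every other class keeps weight 1.0."""
    dist = distances_to_targets(graph, targets)
    return {c: (0.0 if 0 < d <= radius else 1.0) for c, d in dist.items()}

graph = build_graph(EDGES)
dist = distances_to_targets(graph, ["Bark"])
fp_weights = false_positive_weights(graph, ["Bark"], radius=2)
```

Raising `radius` treats a wider neighbourhood of the target as acceptable, which is one way to picture evaluation at a coarser level of the ontology.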

Code & Models

Repositories

haoheliu/ontology-aware-audio-tagging
PyTorch · Official

Taxonomy

Topics: Music and Audio Processing · Speech Recognition and Synthesis · Speech and Audio Processing

Methods: Ontology