SemanticAC: Semantics-Assisted Framework for Audio Classification
Yicheng Xiao, Yue Ma, Shuyan Li, Hantao Zhou, Ran Liao, and Xiu Li

TL;DR
SemanticAC introduces a semantics-assisted framework that leverages language models to extract semantic information from class labels, improving audio classification accuracy by aligning audio signals with label semantics.
Contribution
It is the first to incorporate language-model-derived semantics into audio classification, improving performance over conventional methods that treat class labels as discrete vectors.
Findings
Outperforms existing methods on ESC-50 and US8K datasets.
Utilizes semantic consistency to improve classification accuracy.
Employs a text encoder and similarity module for semantic alignment.
Abstract
In this paper, we propose SemanticAC, a semantics-assisted framework for Audio Classification to better leverage the semantic information. Unlike conventional audio classification methods that treat class labels as discrete vectors, we employ a language model to extract abundant semantics from labels and optimize the semantic consistency between audio signals and their labels. We verify that simple textual information from labels and advanced pretraining models enable more abundant semantic supervision for better performance. Specifically, we design a text encoder to capture the semantic information from the text extension of labels. Then we map the audio signals to align with the semantics of corresponding class labels via an audio encoder and a similarity calculation module so as to enforce the semantic consistency. Extensive experiments on two audio datasets, ESC-50 and US8K…
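The alignment idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy embeddings stand in for the outputs of the text and audio encoders, and the function names, the cosine-similarity choice, the softmax cross-entropy loss, and the temperature value of 0.07 are all assumptions made for illustration.

```python
import math

def l2_normalize(v):
    # Scale a vector to unit length so dot products become cosine similarities.
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_logits(audio_emb, label_embs, temperature=0.07):
    # Similarity module (assumed form): cosine similarity between one audio
    # embedding and every label embedding, divided by a temperature.
    a = l2_normalize(audio_emb)
    logits = []
    for emb in label_embs:
        t = l2_normalize(emb)
        logits.append(sum(x * y for x, y in zip(a, t)) / temperature)
    return logits

def softmax(logits):
    # Numerically stable softmax over the similarity scores.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def alignment_loss(audio_emb, label_embs, target_index, temperature=0.07):
    # Cross-entropy that pulls the audio embedding toward the embedding
    # of its own label text and away from the other labels.
    probs = softmax(cosine_logits(audio_emb, label_embs, temperature))
    return -math.log(probs[target_index])

def predict(audio_emb, label_embs):
    # Classify by picking the label whose text embedding is most similar.
    logits = cosine_logits(audio_emb, label_embs)
    return max(range(len(logits)), key=logits.__getitem__)

# Hypothetical toy data: in practice these vectors would come from a
# pretrained text encoder (label text) and an audio encoder (waveform).
labels = ["dog bark", "siren", "rain"]
label_embs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
audio_emb = [0.9, 0.1, 0.2]

pred = predict(audio_emb, label_embs)
loss_correct = alignment_loss(audio_emb, label_embs, 0)
loss_wrong = alignment_loss(audio_emb, label_embs, 1)
print(labels[pred], loss_correct, loss_wrong)
```

In this sketch the class label is never reduced to an index during scoring: the prediction is whichever label text lies closest to the audio embedding, which is the sense in which semantic consistency between audio and labels is being optimized.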
Taxonomy
Topics: Music and Audio Processing · Digital Media Forensic Detection · Music Technology and Sound Studies
Methods: ALIGN
