The ICME 2025 Audio Encoder Capability Challenge

Junbo Zhang; Heinrich Dinkel; Qiong Song; Helen Wang; Yadong Niu; Si Cheng; Xiaofeng Xin; Ke Li; Wenwu Wang; Yujun Wang; Jian Luan

arXiv:2501.15302·cs.SD·January 28, 2025

Open Access

TL;DR

The ICME 2025 Audio Encoder Capability Challenge evaluates pre-trained audio encoders across diverse real-world tasks, promoting advancements in multi-task learning and practical usability.

Contribution

The paper introduces a standardized challenge with two evaluation tracks, Track A (parameterized) and Track B (parameter-free), to benchmark and improve audio encoder capabilities across real-world scenarios.

Findings

01 Benchmarking of audio encoders across multiple tasks

02 Comparison of parameterized and parameter-free evaluation methods

03 Promotion of advancements in audio encoder design

Abstract

This challenge aims to evaluate the capabilities of audio encoders, especially in the context of multi-task learning and real-world applications. Participants are invited to submit pre-trained audio encoders that map raw waveforms to continuous embeddings. These encoders will be tested across diverse tasks including speech, environmental sounds, and music, with a focus on real-world usability. The challenge features two tracks: Track A for parameterized evaluation, and Track B for parameter-free evaluation. This challenge provides a platform for evaluating and advancing the state-of-the-art in audio encoder design.
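The two tracks can be illustrated with a minimal sketch. The abstract does not say which classifiers the tracks use, so this example assumes a trained linear probe on frozen embeddings for the parameterized case (Track A) and a k-nearest-neighbour vote with no training for the parameter-free case (Track B). The encoder (`toy_encoder`), the synthetic two-class tone dataset, and all function names are hypothetical stand-ins, not part of the challenge toolkit.

```python
import math
import random

def toy_encoder(waveform):
    """Hypothetical stand-in for a pre-trained encoder: raw waveform -> 2-D embedding
    (RMS energy, zero-crossing rate)."""
    rms = math.sqrt(sum(s * s for s in waveform) / len(waveform))
    crossings = sum(1 for a, b in zip(waveform, waveform[1:]) if (a < 0) != (b < 0))
    return [rms, crossings / (len(waveform) - 1)]

def make_dataset(n_per_class, rng, sr=1000, length=500):
    """Synthetic task (assumption): 50 Hz tones are label 0, 400 Hz tones are label 1."""
    data = []
    for label, freq in enumerate((50.0, 400.0)):
        for _ in range(n_per_class):
            wave = [math.sin(2 * math.pi * freq * n / sr) + rng.gauss(0.0, 0.05)
                    for n in range(length)]
            data.append((toy_encoder(wave), label))
    return data

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def linear_probe_accuracy(train, test, steps=1000, lr=0.5):
    """Parameterized evaluation (assumed form): fit a logistic-regression layer
    on frozen embeddings with full-batch gradient descent, then score the test set."""
    w, b = [0.0, 0.0], 0.0
    n = len(train)
    for _ in range(steps):
        gw, gb = [0.0, 0.0], 0.0
        for x, y in train:
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y
            gw[0] += err * x[0]
            gw[1] += err * x[1]
            gb += err
        w = [w[0] - lr * gw[0] / n, w[1] - lr * gw[1] / n]
        b -= lr * gb / n
    hits = sum(1 for x, y in test
               if (w[0] * x[0] + w[1] * x[1] + b > 0) == (y == 1))
    return hits / len(test)

def knn_accuracy(train, test, k=3):
    """Parameter-free evaluation (assumed form): majority vote among the k nearest
    training embeddings; nothing is trained."""
    hits = 0
    for x, y in test:
        nearest = sorted(train, key=lambda item: math.dist(item[0], x))[:k]
        votes = sum(label for _, label in nearest)
        pred = 1 if votes * 2 > k else 0
        hits += int(pred == y)
    return hits / len(test)

rng = random.Random(0)
train = make_dataset(20, rng)
test = make_dataset(10, rng)
acc_a = linear_probe_accuracy(train, test)
acc_b = knn_accuracy(train, test)
print(f"parameterized (linear probe): {acc_a:.2f}")
print(f"parameter-free (k-NN):        {acc_b:.2f}")
```

In both cases the encoder stays frozen. The only difference is whether a small trainable layer sits on top of the embeddings or whether they are compared directly by distance.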

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies