The ICME 2025 Audio Encoder Capability Challenge
Junbo Zhang, Heinrich Dinkel, Qiong Song, Helen Wang, Yadong Niu, Si Cheng, Xiaofeng Xin, Ke Li, Wenwu Wang, Yujun Wang, Jian Luan

TL;DR
The ICME 2025 Audio Encoder Capability Challenge evaluates pre-trained audio encoders across diverse real-world tasks, promoting advancements in multi-task learning and practical usability.
Contribution
It introduces a standardized challenge with two evaluation tracks to benchmark and improve audio encoder capabilities in various real-world scenarios.
Findings
Benchmarking of audio encoders across multiple tasks
Comparison of parameterized and parameter-free evaluation methods
Promotion of advancements in audio encoder design
Abstract
This challenge aims to evaluate the capabilities of audio encoders, especially in the context of multi-task learning and real-world applications. Participants are invited to submit pre-trained audio encoders that map raw waveforms to continuous embeddings. These encoders will be tested across diverse tasks including speech, environmental sounds, and music, with a focus on real-world usability. The challenge features two tracks: Track A for parameterized evaluation, and Track B for parameter-free evaluation. This challenge provides a platform for evaluating and advancing the state-of-the-art in audio encoder design.
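The two tracks can be illustrated with a minimal sketch. The abstract does not say how either track is implemented, so everything below is an assumption: a linear probe (fitted here by ridge regression onto one-hot labels) stands in for the parameterized evaluation of Track A, and a k-nearest-neighbour vote over frozen embeddings stands in for the parameter-free evaluation of Track B. `toy_encoder`, `make_dataset`, and `evaluate_tracks` are hypothetical names, and the synthetic two-class tone data merely takes the place of real speech, environmental sound, or music tasks.

```python
import numpy as np


def toy_encoder(waveform: np.ndarray, dim: int = 8) -> np.ndarray:
    """Hypothetical stand-in for a pre-trained encoder: raw waveform -> embedding.

    Averages the magnitude spectrum over `dim` frequency bands (log-compressed).
    """
    spectrum = np.abs(np.fft.rfft(waveform))
    bands = np.array_split(spectrum, dim)
    return np.log1p(np.array([band.mean() for band in bands]))


def make_dataset(n_per_class: int, rng: np.random.Generator):
    """Synthetic two-class task (assumption): low tones (0) vs. high tones (1)."""
    sample_rate, n_samples = 8000, 800
    t = np.arange(n_samples) / sample_rate
    embeddings, labels = [], []
    for label, freq in enumerate((200.0, 3000.0)):
        for _ in range(n_per_class):
            phase = rng.uniform(0.0, 2.0 * np.pi)
            wave = np.sin(2.0 * np.pi * freq * t + phase)
            wave += 0.1 * rng.standard_normal(n_samples)
            embeddings.append(toy_encoder(wave))
            labels.append(label)
    return np.stack(embeddings), np.array(labels)


def linear_probe_fit(train_x, train_y, n_classes, ridge=1e-3):
    """Parameterized evaluation (assumed): fit one linear layer on frozen embeddings."""
    x = np.hstack([train_x, np.ones((len(train_x), 1))])  # append a bias column
    targets = np.eye(n_classes)[train_y]                   # one-hot labels
    gram = x.T @ x + ridge * np.eye(x.shape[1])
    return np.linalg.solve(gram, x.T @ targets)


def linear_probe_predict(weights, test_x):
    x = np.hstack([test_x, np.ones((len(test_x), 1))])
    return (x @ weights).argmax(axis=1)


def knn_predict(train_x, train_y, test_x, k=3):
    """Parameter-free evaluation (assumed): majority vote among the k nearest embeddings."""
    dists = np.linalg.norm(test_x[:, None, :] - train_x[None, :, :], axis=-1)
    nearest = np.argsort(dists, axis=1)[:, :k]
    return np.array([np.bincount(votes).argmax() for votes in train_y[nearest]])


def evaluate_tracks(seed: int = 0):
    """Return (linear-probe accuracy, kNN accuracy) on the synthetic task."""
    rng = np.random.default_rng(seed)
    train_x, train_y = make_dataset(20, rng)
    test_x, test_y = make_dataset(10, rng)
    weights = linear_probe_fit(train_x, train_y, n_classes=2)
    acc_probe = float((linear_probe_predict(weights, test_x) == test_y).mean())
    acc_knn = float((knn_predict(train_x, train_y, test_x) == test_y).mean())
    return acc_probe, acc_knn


if __name__ == "__main__":
    probe_acc, knn_acc = evaluate_tracks()
    print(f"linear probe accuracy: {probe_acc:.2f}")
    print(f"kNN accuracy:          {knn_acc:.2f}")
```

In this sketch the encoder stays frozen in both cases. The only difference is whether the evaluation step learns weights of its own (the probe) or relies purely on distances in the embedding space (the neighbour vote).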
Taxonomy
Topics: Music and Audio Processing · Speech and Audio Processing · Music Technology and Sound Studies
