The Voice Timbre Attribute Detection 2025 Challenge Evaluation Plan

Zhengyan Sheng; Jinghao He; Liping Chen; Kong Aik Lee; Zhen-Hua Ling

arXiv:2505.09382 · cs.SD · June 24, 2025

TL;DR

The VtaD 2025 challenge evaluates methods that explain voice timbre comparatively: given two voices, a method judges their relative intensity along a sensory descriptor dimension such as bright, coarse, soft, or magnetic.

Contribution

This paper presents the evaluation plan for a new challenge focused on explaining voice timbre attributes via comparative analysis with sensory descriptors.

Findings

01 Challenge launched in May 2025

02 Culminates in a special proposal at the NCMMSC2025 conference in October 2025 in Zhenjiang, China

03 Focus on explaining voice timbre by comparing two voices along sensory descriptor dimensions

Abstract

Voice timbre refers to the unique quality or character of a person's voice that distinguishes it from others as perceived by human hearing. The Voice Timbre Attribute Detection (VtaD) 2025 challenge focuses on explaining the voice timbre attribute in a comparative manner. In this challenge, the human impression of voice timbre is verbalized with a set of sensory descriptors, including bright, coarse, soft, magnetic, and so on. The timbre is explained from the comparison between two voices in their intensity within a specific descriptor dimension. The VtaD 2025 challenge starts in May and culminates in a special proposal at the NCMMSC2025 conference in October 2025 in Zhenjiang, China.
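
The comparative formulation in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the challenge's actual data format or method: the `ComparisonTrial` class, the `compare` function, the utterance identifiers, and the made-up intensity scores are all hypothetical. Only the four descriptor names come from the abstract, which does not list the full set.

```python
from dataclasses import dataclass

# Descriptors named in the abstract; the full set is not listed there.
DESCRIPTORS = ("bright", "coarse", "soft", "magnetic")


@dataclass(frozen=True)
class ComparisonTrial:
    """Hypothetical trial: is voice B stronger than voice A in one descriptor?"""
    voice_a: str     # identifier of the first utterance (illustrative)
    voice_b: str     # identifier of the second utterance (illustrative)
    descriptor: str  # one descriptor dimension, e.g. "bright"


def compare(trial, intensity):
    """Return True if voice B has the higher intensity in the trial's descriptor.

    `intensity` maps (voice, descriptor) to a float score. Where such scores
    come from (human ratings, a model) is outside the scope of this sketch.
    """
    a = intensity[(trial.voice_a, trial.descriptor)]
    b = intensity[(trial.voice_b, trial.descriptor)]
    return b > a


# Made-up scores, for illustration only.
scores = {
    ("spk1_utt", "bright"): 0.2,
    ("spk2_utt", "bright"): 0.7,
    ("spk1_utt", "coarse"): 0.6,
    ("spk2_utt", "coarse"): 0.1,
}

print(compare(ComparisonTrial("spk1_utt", "spk2_utt", "bright"), scores))
print(compare(ComparisonTrial("spk1_utt", "spk2_utt", "coarse"), scores))
```

The point of the sketch is that a judgment is always relative to a pair of voices and a single descriptor: the same pair can be ordered one way for "bright" and the other way for "coarse".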

Taxonomy

Topics: Music and Audio Processing · Speech and Audio Processing · Speech Recognition and Synthesis

Methods: Sparse Evolutionary Training