DAME: Duration-Aware Matryoshka Embedding for Duration-Robust Speaker Verification
Youngmoon Jung, Joon-Young Yang, Ju-ho Kim, Jaeyoung Roh, Chang Woo Han, Hoon-Young Cho

TL;DR
DAME introduces a duration-aware hierarchical embedding framework for speaker verification, improving accuracy on short utterances without extra inference cost by capturing duration-specific speaker traits.
Contribution
The paper proposes a nested hierarchy of sub-embeddings aligned to utterance duration, which improves speaker verification performance on short utterances.
Findings
Reduces EER on short-duration trials across multiple datasets.
Maintains full-length performance without additional inference cost.
Works with various encoder architectures and training setups.
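The findings are reported in terms of equal error rate (EER), the operating point at which the false-acceptance rate equals the false-rejection rate. As a reminder of what the metric measures, here is a minimal sketch in plain Python; the function name `equal_error_rate` and the toy scores are illustrative assumptions and are not taken from the paper or its evaluation code.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping every observed score as a threshold.

    A trial is accepted when its score is >= the threshold.
    Returns the mean of FAR and FRR at the threshold where they are closest.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("need at least one target and one non-target score")

    best_gap = None
    best_eer = None
    for threshold in sorted(set(target_scores) | set(nontarget_scores)):
        # False acceptance: a non-target (impostor) trial is accepted.
        far = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
        # False rejection: a target (same-speaker) trial is rejected.
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap = gap
            best_eer = (far + frr) / 2.0
    return best_eer


# Toy scores, invented for illustration only.
toy_targets = [0.9, 0.8, 0.7, 0.3]
toy_nontargets = [0.6, 0.4, 0.2, 0.1]
toy_eer = equal_error_rate(toy_targets, toy_nontargets)
```

Production toolkits usually interpolate between thresholds rather than picking the closest observed one, so values from this sketch can differ slightly from reported numbers.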
Abstract
Short-utterance speaker verification remains challenging due to limited speaker-discriminative cues in short speech segments. While existing methods focus on enhancing speaker encoders, the embedding learning strategy still forces a single fixed-dimensional representation reused for utterances of any length, leaving capacity misaligned with the information available at different durations. We propose Duration-Aware Matryoshka Embedding (DAME), a model-agnostic framework that builds a nested hierarchy of sub-embeddings aligned to utterance durations: lower-dimensional representations capture compact speaker traits from short utterances, while higher dimensions encode richer details from longer speech. DAME supports both training from scratch and fine-tuning, and serves as a direct alternative to conventional large-margin fine-tuning, consistently improving performance across durations.…
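To make the nested-hierarchy idea concrete, the sketch below shows Matryoshka-style prefix sub-embeddings: each smaller representation is simply the leading dimensions of the full vector. Everything specific here is an assumption for illustration: the dimensions in `NESTED_DIMS`, the duration cut-offs in `DURATION_BOUNDS_S`, and the choice to score with a duration-selected prefix are hypothetical, and the abstract does not state how DAME selects dimensions or scores trials at inference time.

```python
import math

# Hypothetical nested sub-embedding sizes and duration cut-offs in seconds.
# These values are NOT from the paper; they only illustrate the structure.
NESTED_DIMS = (64, 128, 256)
DURATION_BOUNDS_S = (2.0, 5.0)


def dim_for_duration(duration_s):
    """Map an utterance duration to a sub-embedding size (assumed rule)."""
    for bound, dim in zip(DURATION_BOUNDS_S, NESTED_DIMS):
        if duration_s < bound:
            return dim
    return NESTED_DIMS[-1]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def nested_score(enroll_emb, test_emb, test_duration_s):
    """Score a trial using the prefix matched to the test duration.

    The prefix is a slice of the full embedding, so no second forward
    pass through the encoder is needed.
    """
    dim = dim_for_duration(test_duration_s)
    return cosine(enroll_emb[:dim], test_emb[:dim])


# Toy embeddings, invented for illustration only.
enroll = [float(i % 7 + 1) for i in range(256)]
test = [float(i % 5 + 1) for i in range(256)]
short_trial_score = nested_score(enroll, test, test_duration_s=1.5)
long_trial_score = nested_score(enroll, test, test_duration_s=8.0)
```

Because every sub-embedding is a slice of one vector, a single encoder pass serves all durations, which is consistent with the abstract's claim of no added inference cost under the assumptions above.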
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Voice and Speech Disorders
