Triage knowledge distillation for speaker verification
Ju-ho Kim, Youngmoon Jung, Joon-Young Yang, Jaeyoung Roh, Chang Woo Han, Hoon-Young Cho

TL;DR
This paper introduces Triage Knowledge Distillation (TRKD), a distillation method for speaker verification that improves model performance by selectively transferring relational information based on example difficulty. It targets deployment on resource-limited devices.
Contribution
TRKD operationalizes assess-prioritize-focus in knowledge distillation, using a cumulative-probability cutoff to focus on confusable classes and progressively refine learning, outperforming existing KD methods.
Findings
TRKD achieves the lowest equal error rate (EER) across all protocols on VoxCeleb1.
TRKD outperforms recent KD variants in speaker verification tasks.
TRKD effectively handles large-class settings with long-tail distributions.
Abstract
Deploying speaker verification on resource-constrained devices remains challenging due to the computational cost of high-capacity models; knowledge distillation (KD) offers a remedy. Classical KD entangles target confidence with non-target structure in a Kullback-Leibler term, limiting the transfer of relational information. Decoupled KD separates these signals into target and non-target terms, yet treats non-targets uniformly and remains vulnerable to the long tail of low-probability classes in large-class settings. We introduce Triage KD (TRKD), a distillation scheme that operationalizes assess-prioritize-focus. TRKD introduces a cumulative-probability cutoff to assess per-example difficulty and partition the teacher posterior into three groups: the target class, a high-probability non-target confusion-set, and a background-set. To prioritize informative signals, TRKD distills…
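The partitioning step described above can be illustrated with a minimal sketch. The abstract does not state the exact rule, so this example assumes that non-target classes are ranked by teacher probability and added to the confusion-set until their share of the non-target mass reaches a cutoff. The function name `triage_partition`, the `cutoff` parameter, and the renormalization over non-target mass are hypothetical choices made for illustration, not the authors' implementation.

```python
from typing import List, Tuple


def triage_partition(
    teacher_probs: List[float], target: int, cutoff: float = 0.9
) -> Tuple[List[int], List[int]]:
    """Split non-target classes into (confusion_set, background_set).

    Assumption: the cutoff applies to cumulative teacher probability,
    renormalized over non-target classes only.
    """
    # Rank non-target classes by teacher probability, highest first.
    non_target = [i for i in range(len(teacher_probs)) if i != target]
    non_target.sort(key=lambda i: teacher_probs[i], reverse=True)

    non_target_mass = sum(teacher_probs[i] for i in non_target)
    if non_target_mass <= 0.0:
        # Teacher puts all mass on the target: nothing is confusable.
        return [], non_target

    confusion: List[int] = []
    cumulative = 0.0
    for i in non_target:
        # Stop once the confusion-set already covers the cutoff share.
        if cumulative / non_target_mass >= cutoff:
            break
        confusion.append(i)
        cumulative += teacher_probs[i]

    # Everything not selected is treated as the low-probability background.
    background = non_target[len(confusion):]
    return confusion, background


# Toy teacher posterior over six speakers; class 0 is the target.
probs = [0.6, 0.2, 0.1, 0.05, 0.03, 0.02]
confusion_set, background_set = triage_partition(probs, target=0, cutoff=0.7)
print(confusion_set, background_set)  # [1, 2] [3, 4, 5]
```

Under this assumed rule, an example whose teacher posterior spreads over many non-target classes yields a larger confusion-set than one with a single strong competitor, which is one plausible way a cumulative cutoff could reflect per-example difficulty.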
Taxonomy
Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Machine Learning and Algorithms
