TL;DR
This paper pairs a pre-trained ECAPA-TDNN encoder with margin-based losses for the speaker-controlled spoken language identification task of the TidyLang Challenge 2026, reaching 85.95% macro accuracy and 90.96% micro accuracy on the Tidy-X dataset.
Contribution
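It combines a pre-trained ECAPA-TDNN feature encoder with margin-based losses for speaker-controlled language identification, aiming to improve inter-class separability and to reduce interference from non-linguistic factors such as speaker characteristics.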
Findings
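Achieved 85.95% macro accuracy and 90.96% micro accuracy on the Tidy-X language identification task.
Reduced the equal error rate (EER) on the verification task to 17.08%, halving the error rate.
Improved on the official baseline by 45.7% in macro accuracy and 15.2% in micro accuracy.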
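The three metrics reported above can be illustrated with a minimal sketch. It assumes the usual definitions (micro accuracy as the overall fraction of correct utterances, macro accuracy as the unweighted mean of per-language accuracy, EER as the point where false-accept and false-reject rates meet); the summary does not spell these out, and the function names and toy data below are hypothetical, not taken from the paper.

```python
from collections import defaultdict


def micro_accuracy(y_true, y_pred):
    """Fraction of all utterances classified correctly."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_accuracy(y_true, y_pred):
    """Unweighted mean of per-language accuracy (per-class recall)."""
    total = defaultdict(int)
    correct = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate EER by sweeping thresholds over the observed scores."""
    best_gap, eer = float("inf"), None
    for thr in sorted(set(target_scores) | set(nontarget_scores)):
        # False accepts: non-target trials scored at or above the threshold.
        far = sum(s >= thr for s in nontarget_scores) / len(nontarget_scores)
        # False rejects: target trials scored below the threshold.
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer


# Toy, made-up labels: "en" is frequent, "cy" is rare.
y_true = ["en"] * 8 + ["cy"] * 2
y_pred = ["en"] * 8 + ["en", "cy"]
print(micro_accuracy(y_true, y_pred))  # 0.9
print(macro_accuracy(y_true, y_pred))  # 0.75
```

In the toy data, one error on the rare language barely moves micro accuracy but pulls macro accuracy down sharply. A gap in the same direction (90.96% micro versus 85.95% macro) would be consistent with weaker performance on less frequent languages, although the summary itself does not state the cause.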
Abstract
For the speaker-controlled spoken language identification task proposed in the TidyLang Challenge 2026, this paper proposes a language identification method based on pre-trained models and margin-based losses. The proposed method adopts a pre-trained ECAPA-TDNN as the feature encoder and incorporates margin-based losses to enhance the discriminative ability of language representations, thereby improving inter-class separability and reducing the interference of non-linguistic factors such as speaker characteristics. Experimental results on the Tidy-X dataset show that the proposed method achieves 85.95% macro accuracy and 90.96% micro accuracy on the language identification task and 17.08% equal error rate (EER) on the verification task. Compared with the official baseline, the macro accuracy improves by 45.7%, the micro accuracy improves by 15.2%, and the EER is reduced by approximately…
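The abstract refers to margin-based losses without naming a specific one. As an illustration only, the sketch below shows one widely used member of that family, additive angular margin softmax (AAM-softmax, ArcFace-style), applied to a single utterance embedding. The choice of this loss, the `margin` and `scale` values, and the toy vectors are all assumptions made for the example; no ECAPA-TDNN encoder is included, and the embedding simply stands in for its output.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def aam_softmax_loss(embedding, class_weights, label, margin=0.2, scale=30.0):
    """Additive angular margin softmax loss for one utterance.

    embedding: utterance-level vector from some encoder (hypothetical here).
    class_weights: one prototype vector per language.
    margin, scale: illustrative values, not taken from the paper.
    """
    e = l2_normalize(embedding)
    logits = []
    for k, w in enumerate(class_weights):
        cos = sum(a * b for a, b in zip(e, l2_normalize(w)))
        cos = max(-1.0, min(1.0, cos))  # guard acos against rounding
        if k == label:
            # Penalize the true class by adding the margin to its angle.
            cos = math.cos(math.acos(cos) + margin)
        logits.append(scale * cos)
    # Numerically stable cross-entropy over the scaled logits.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


# Toy embedding that sits exactly between two language prototypes.
prototypes = [[1.0, 0.0], [0.0, 1.0]]
plain = aam_softmax_loss([1.0, 1.0], prototypes, label=0, margin=0.0)
with_margin = aam_softmax_loss([1.0, 1.0], prototypes, label=0, margin=0.2)
print(plain, with_margin)
```

With `margin=0.0` the function reduces to ordinary softmax cross-entropy on cosine logits. A positive margin raises the loss for the same embedding, so training has to pull embeddings closer to their own language prototype, which is the sense in which such losses improve inter-class separability. This sketch omits the special handling that practical implementations use when the angle plus the margin exceeds π.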
