Cross-Corpora Language Recognition: A Preliminary Investigation with Indian Languages

Spandan Dey; Goutam Saha; Md Sahidullah

arXiv:2105.04639·eess.AS·May 13, 2021

Open Access

TL;DR

This study evaluates the performance of spoken language identification systems across different Indian language corpora, highlighting significant performance drops due to corpus mismatch and demonstrating that feature normalization can improve cross-corpora accuracy.

Contribution

One of the first investigations into cross-corpora evaluation for Indian spoken language identification (LID). It analyzes the mismatch between corpora and applies feature normalization to improve cross-corpora performance.

Findings

01

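Cross-corpora performance degrades significantly: a system trained on IIITH-ILSC shows an average EER of 11.80% on the same corpus and 43.34% on LDC South Asian.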

02

Feature normalization improves cross-corpora LID accuracy.

03

The corpora differ significantly in long-term average spectrum (LTAS) and signal-to-noise ratio (SNR).
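This summary does not say which feature normalization the authors apply. As a minimal sketch, assuming per-utterance cepstral mean and variance normalization (CMVN), one common choice for MFCC features, the idea can be illustrated as follows. The function name `cmvn` and the toy data are hypothetical and are not taken from the paper.

```python
import math


def cmvn(frames, eps=1e-8):
    """Per-utterance mean and variance normalization (assumed technique).

    frames: list of equal-length lists, one feature vector (e.g. MFCCs) per frame.
    Returns a new list in which every coefficient has zero mean and
    (approximately) unit variance over the utterance.
    """
    n_frames = len(frames)
    n_coeffs = len(frames[0])
    # Mean of each coefficient over time
    means = [sum(f[d] for f in frames) / n_frames for d in range(n_coeffs)]
    # Population standard deviation of each coefficient over time
    stds = [
        math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n_frames)
        for d in range(n_coeffs)
    ]
    return [
        [(f[d] - means[d]) / (stds[d] + eps) for d in range(n_coeffs)]
        for f in frames
    ]


# Toy data: the same "utterance" seen through two channels that differ
# by a per-coefficient gain and offset (a crude stand-in for corpus mismatch).
clean = [[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]]
shifted = [[2.0 * a + 4.0, 0.5 * b - 1.0] for a, b in clean]

norm_clean = cmvn(clean)
norm_shifted = cmvn(shifted)
```

After normalization the two toy utterances are nearly identical, which is why this kind of normalization can reduce sensitivity to recording conditions. Real corpus mismatch is more complex than a fixed gain and offset, so this only illustrates the mechanism.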

Abstract

In this paper, we conduct one of the very first studies of cross-corpora performance evaluation for the spoken language identification (LID) problem. Cross-corpora evaluation has not been explored much in LID research, especially for Indian languages. We selected three Indian spoken language corpora: IIITH-ILSC, LDC South Asian, and IITKGP-MLILSC. For each corpus, LID systems are trained with a state-of-the-art time-delay neural network (TDNN) based architecture on MFCC features. We observe that LID performance degrades drastically in cross-corpora evaluation. For example, the system trained on the IIITH-ILSC corpus shows an average EER of 11.80% and 43.34% when evaluated on the same corpus and on the LDC South Asian corpus, respectively. Our preliminary analysis shows the significant differences among these corpora in terms of mismatch in the long-term average spectrum…
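The abstract reports results as equal error rate (EER), the operating point at which the miss rate equals the false-alarm rate. The paper's exact scoring protocol is not given here, so the following is only a rough sketch of how an EER can be approximated from detection scores by sweeping a threshold. The function name `equal_error_rate` and the example scores are hypothetical.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    target_scores: scores for trials where the hypothesized language is correct.
    nontarget_scores: scores for trials where it is not.
    A trial is accepted when its score is >= the threshold.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = float("inf"), None
    for t in thresholds:
        # Miss rate: target trials rejected at this threshold
        fnr = sum(s < t for s in target_scores) / len(target_scores)
        # False-alarm rate: non-target trials accepted at this threshold
        fpr = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(fnr - fpr)
        if gap < best_gap:
            # Keep the point where the two error rates are closest
            best_gap, eer = gap, (fnr + fpr) / 2
    return eer


# Made-up scores for illustration only (not data from the paper)
targets = [0.9, 0.8, 0.7, 0.3]
nontargets = [0.6, 0.4, 0.2, 0.1]
eer = equal_error_rate(targets, nontargets)
```

With a finite set of trials the two error rates rarely cross exactly, so this sketch returns the average of the two at the closest point. Evaluation toolkits typically interpolate instead.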

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Music and Audio Processing