Enhancing Neural Spoken Language Recognition: An Exploration with Multilingual Datasets

Or Haim Anidjar; Roi Yozevitch

arXiv:2501.11065·cs.SD·January 22, 2025

TL;DR

This paper presents a novel multilingual neural spoken language recognition system that employs optimized Time Delay Neural Networks with a specialized pooling layer, achieving 97% accuracy across ten diverse languages.

Contribution

The paper introduces an improved Time Delay Neural Network (TDNN) architecture with additional layers and a funnel shape, tuned through a grid search over hyperparameters, advancing multilingual spoken language recognition.
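The idea of a funnel-shaped network tuned by grid search can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the helper names `funnel_widths` and `grid`, the hyperparameter names, and the candidate values do not come from the paper, which may define its funnel and its search space differently.

```python
from itertools import product


def funnel_widths(first_width, num_layers, shrink):
    """Return layer widths that narrow by a constant factor per layer.

    Hypothetical scheme: the paper's actual layer sizes are not given here.
    """
    widths = []
    width = float(first_width)
    for _ in range(num_layers):
        widths.append(max(1, int(round(width))))
        width *= shrink
    return widths


def grid(search_space):
    """Yield every combination of the candidate values as a dict."""
    keys = sorted(search_space)
    for values in product(*(search_space[key] for key in keys)):
        yield dict(zip(keys, values))


# Illustrative search space; these values are assumptions, not the paper's.
search_space = {
    "first_width": [512, 1024],
    "num_layers": [5, 7],
    "shrink": [0.5, 0.75],
}

configs = list(grid(search_space))
layer_plans = [funnel_widths(**config) for config in configs]
```

In a full experiment, each entry of `layer_plans` would be used to build and train one network, and the configuration with the best validation score would be kept; that training loop is omitted here.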

Findings

01 Achieved 97% language recognition accuracy.

02 Enhanced TDNN architecture with additional layers and a funnel structure.

03 Effective use of augmented data for training.

Abstract

In this research, we advanced a spoken language recognition system, moving beyond traditional feature vector-based models. Our improvements focused on effectively capturing language characteristics over extended periods using a specialized pooling layer. We utilized a broad dataset range from Common-Voice, targeting ten languages across Indo-European, Semitic, and East Asian families. The major innovation involved optimizing the architecture of Time Delay Neural Networks. We introduced additional layers and restructured these networks into a funnel shape, enhancing their ability to process complex linguistic patterns. A rigorous grid search determined the optimal settings for these networks, significantly boosting their efficiency in language pattern recognition from audio samples. The model underwent extensive training, including a phase with augmented data, to refine its capabilities.…
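One common way to summarize frame-level features over a long time span is statistics pooling, which collapses a variable-length sequence into a fixed-length vector of per-dimension means and standard deviations. The sketch below shows that general technique only; the abstract does not specify how the paper's specialized pooling layer works, so this should not be read as the authors' method. The function name `statistics_pooling` is an assumption.

```python
import math


def statistics_pooling(frames):
    """Pool T frame vectors of size D into one vector of size 2 * D.

    The result holds the per-dimension means followed by the per-dimension
    (population) standard deviations, so its length does not depend on T.
    Illustrative only: the paper's pooling layer may differ.
    """
    if not frames:
        raise ValueError("need at least one frame")
    num_frames = len(frames)
    dim = len(frames[0])
    means = [sum(frame[d] for frame in frames) / num_frames for d in range(dim)]
    stds = [
        math.sqrt(sum((frame[d] - means[d]) ** 2 for frame in frames) / num_frames)
        for d in range(dim)
    ]
    return means + stds


# Two utterances of different lengths map to vectors of the same size.
short_utterance = [[1.0, 2.0], [3.0, 2.0]]
long_utterance = [[0.0, 1.0], [2.0, 1.0], [4.0, 1.0], [6.0, 1.0]]
pooled_short = statistics_pooling(short_utterance)
pooled_long = statistics_pooling(long_utterance)
```

Because the output size is fixed, the layers after pooling can classify utterances of any duration, which is what lets a pooled representation reflect language characteristics over the whole recording rather than a single frame window.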

Taxonomy

Topics: Neural Networks and Applications