# Biomimetic Computing for Efficient Spoken Language Identification

**Authors:** Gaurav Kumar, Saurabh Bhardwaj

PMC · DOI: 10.3390/biomimetics10050316 · Biomimetics · 2025-05-14

## TL;DR

This paper introduces DBODL-MSLIS, a biomimetic computing method for spoken language identification that combines Dung Beetle Optimization with deep learning and reports 95.54% average accuracy on the IIIT Spoken Language dataset.

## Contribution

The DBODL-MSLIS approach uses Dung Beetle Optimization (DBO) to select speech features and a long short-term memory (LSTM) network, paired with Bayesian optimization, to classify spoken languages.

## Key findings

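- The DBODL-MSLIS method achieved an average accuracy of 95.54% and an F-score of 84.31% on the IIIT Spoken Language dataset.
- Compared with biomimetic computing models (GA, PSO, GWO, DE, and ACO), Bayesian Optimization with LSTM reached a peak accuracy of 95.55%, while ACO achieved up to 89.45%.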
- The technique surpassed state-of-the-art models like SVM, MLP, and VGG-16 in language identification performance.

## Abstract

Applications based on Spoken Language Identification (SLID) have become increasingly important in everyday life, driven by advances in artificial intelligence and machine learning. Multilingual countries use SLID to facilitate speech detection by determining the language of spoken segments with language recognizers. With multilingual datasets, however, languages that share a common origin are difficult to classify accurately with automatic techniques. A further challenge is the large variance in speech signals caused by differences in speakers, content, acoustic settings, and language, by changes in voice modulation with age and gender, and by variations in speech patterns.

In this study, we introduce the DBODL-MSLIS approach, which integrates biomimetic optimization techniques inspired by natural intelligence to enhance language classification. The method employs Dung Beetle Optimization (DBO) with deep learning, simulating the beetle’s foraging behavior to optimize feature selection and classification performance. Speech preprocessing comprises pre-emphasis, windowing, and frame blocking, followed by feature extraction using pitch, energy, the Discrete Wavelet Transform (DWT), and the zero-crossing rate (ZCR). The DBO algorithm then selects features, removing redundant ones to improve efficiency and accuracy. Spoken languages are classified using Bayesian optimization (BO) in conjunction with a long short-term memory (LSTM) network.

The DBODL-MSLIS technique has been experimentally validated on the IIIT Spoken Language dataset, with an average accuracy of 95.54% and an F-score of 84.31%. It surpasses various other state-of-the-art models, such as SVM, MLP, LDA, DLA-ASLISS, HMHFS-IISLFAS, GA base fusion, and VGG-16. We also compared its accuracy against state-of-the-art biomimetic computing models such as GA, PSO, GWO, DE, and ACO. While ACO achieved up to 89.45% accuracy, our Bayesian Optimization with LSTM outperformed all others, reaching a peak accuracy of 95.55%. The technique shows promising potential for practical applications in multilingual voice processing.
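
To make the preprocessing steps named in the abstract concrete, the following is a minimal Python sketch of pre-emphasis, frame blocking, windowing, and two of the listed features (short-time energy and ZCR). It is an illustration under stated assumptions, not the authors' implementation: the function names, the pre-emphasis coefficient of 0.97, the Hamming window, the frame length and hop size, and the order of framing before windowing are all assumptions that the abstract does not specify. Pitch and DWT features, DBO feature selection, and the BO-tuned LSTM are omitted.

```python
import math


def pre_emphasis(signal, alpha=0.97):
    """Boost high frequencies: y[n] = x[n] - alpha * x[n-1].

    alpha=0.97 is a commonly used default, not a value taken from the paper.
    """
    if not signal:
        return []
    return [signal[0]] + [
        signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))
    ]


def frame_blocks(signal, frame_len, hop):
    """Split the signal into overlapping frames, dropping an incomplete tail."""
    return [
        signal[start:start + frame_len]
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def hamming(frame):
    """Apply a symmetric Hamming window (an assumed window type)."""
    n = len(frame)
    if n < 2:
        return list(frame)
    return [
        x * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
        for i, x in enumerate(frame)
    ]


def short_time_energy(frame):
    """Sum of squared samples in one frame."""
    return sum(x * x for x in frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def extract_features(signal, frame_len=400, hop=160):
    """Return one (energy, zcr) pair per frame.

    frame_len=400 and hop=160 correspond to 25 ms frames with a 10 ms hop
    at 16 kHz; these are illustrative defaults, not the paper's settings.
    """
    emphasized = pre_emphasis(signal)
    features = []
    for frame in frame_blocks(emphasized, frame_len, hop):
        windowed = hamming(frame)
        features.append(
            (short_time_energy(windowed), zero_crossing_rate(windowed))
        )
    return features
```

Calling `extract_features(samples)` on a list of audio samples yields a per-frame feature list. In a pipeline like the one the abstract describes, vectors of this kind would be candidates for feature selection before being passed to the classifier.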

## Full-text entities

- **Diseases:** injury to (MESH:D014947), LID (MESH:D007806)
- **Chemicals:** DBO (-)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12108623/full.md

## Figures

10 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12108623/full.md

## References

37 references — full list in the complete paper: https://tomesphere.com/paper/PMC12108623/full.md

---
Source: https://tomesphere.com/paper/PMC12108623