Optimizing Neural Architectures for Hindi Speech Separation and Enhancement in Noisy Environments

Arnav Ramamoorthy

arXiv:2508.12009·cs.SD·August 25, 2025

TL;DR

This paper presents a neural network-based approach for Hindi speech separation and enhancement tailored for edge devices, achieving improved clarity and intelligibility in noisy environments through model fine-tuning and quantization.

Contribution

The paper introduces a refined DEMUCS-based model with U-Net and LSTM layers, optimized for resource-constrained devices and trained on 400,000 Hindi speech clips augmented with noise from ESC-50 and MS-SNSD.

Findings

1. Superior PESQ and STOI scores under noisy conditions

2. Effective quantization for edge device deployment

3. Enhanced speech clarity in extreme noise environments
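
To make the quantization finding concrete, here is a minimal sketch of symmetric per-tensor int8 weight quantization in NumPy. This summary does not say which quantization scheme the paper uses, so the scheme, the function names `quantize_int8` and `dequantize_int8`, and the random 64x64 weight matrix are all illustrative assumptions, not the author's method.

```python
import numpy as np


def quantize_int8(weights):
    """Symmetric per-tensor quantization of float weights to int8 (assumed scheme)."""
    max_abs = float(np.max(np.abs(weights)))
    # Guard against an all-zero tensor, which would otherwise give a zero scale.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale


def dequantize_int8(q, scale):
    """Map int8 codes back to approximate float32 weights."""
    return q.astype(np.float32) * scale


# Hypothetical weight matrix standing in for one layer of the model.
rng = np.random.default_rng(0)
w = rng.standard_normal((64, 64)).astype(np.float32)

q, scale = quantize_int8(w)
w_hat = dequantize_int8(q, scale)

# Worst-case reconstruction error; rounding bounds it by about scale / 2.
max_err = float(np.max(np.abs(w - w_hat)))
print(f"bytes: {w.nbytes} -> {q.nbytes}, max abs error: {max_err:.5f}")
```

Storing int8 codes plus one scale per tensor cuts weight memory by a factor of four relative to float32, which is the kind of saving that matters on devices such as TWS earbuds. Whether accuracy holds up after quantization has to be checked against PESQ and STOI on the actual model.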

Abstract

This paper addresses the challenges of Hindi speech separation and enhancement using advanced neural network architectures, with a focus on edge devices. We propose a refined approach leveraging the DEMUCS model to overcome limitations of traditional methods, achieving substantial improvements in speech clarity and intelligibility. The model is fine-tuned with U-Net and LSTM layers, trained on a dataset of 400,000 Hindi speech clips augmented with ESC-50 and MS-SNSD for diverse acoustic environments. Evaluation using PESQ and STOI metrics shows superior performance, particularly under extreme noise conditions. To ensure deployment on resource-constrained devices like TWS earbuds, we explore quantization techniques to reduce computational requirements. This research highlights the effectiveness of customized AI algorithms for speech processing in Indian contexts and suggests future…
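
The augmentation step mentioned in the abstract, combining clean speech with environmental noise, is commonly done by scaling the noise to a target signal-to-noise ratio before adding it. The sketch below shows that standard procedure under stated assumptions: the abstract does not give the SNR levels, sample rate, or mixing code, so `mix_at_snr`, the 16 kHz rate, the 0 dB target, and the synthetic signals are hypothetical stand-ins for real Hindi clips and ESC-50 or MS-SNSD noise.

```python
import numpy as np


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio is `snr_db` dB, then add it."""
    # Tile or trim the noise so it matches the speech length.
    if len(noise) < len(speech):
        reps = int(np.ceil(len(speech) / len(noise)))
        noise = np.tile(noise, reps)
    noise = noise[: len(speech)]

    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2) + 1e-12  # avoid division by zero
    target_noise_power = speech_power / (10.0 ** (snr_db / 10.0))
    gain = np.sqrt(target_noise_power / noise_power)
    scaled_noise = gain * noise
    return speech + scaled_noise, scaled_noise


# Synthetic stand-ins: a tone for the speech clip, white noise for the noise clip.
sr = 16000  # assumed sample rate, not stated in the abstract
t = np.arange(sr) / sr
speech = 0.5 * np.sin(2 * np.pi * 220.0 * t)
rng = np.random.default_rng(0)
noise = rng.standard_normal(sr // 2)

noisy, scaled_noise = mix_at_snr(speech, noise, snr_db=0.0)

# Verify the SNR that was actually achieved after scaling.
achieved_snr_db = 10 * np.log10(np.mean(speech ** 2) / np.mean(scaled_noise ** 2))
print(f"achieved SNR: {achieved_snr_db:.2f} dB")
```

Sweeping `snr_db` over a range, including negative values, is one way to cover the extreme-noise conditions the abstract highlights.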
