Hybrid ASR for Resource-Constrained Robots: HMM - Deep Learning Fusion

Anshul Ranjan; Kaushik Jegadeesan

arXiv:2309.07164 · eess.AS · December 25, 2024

TL;DR

This paper introduces a hybrid ASR system for resource-limited robots that combines HMMs with deep learning. Processing is distributed between the robot and a PC, which the authors report enables real-time, accurate speech recognition that adapts to varying acoustic environments.

Contribution

The paper presents a hybrid architecture that uses socket programming to split ASR processing between the robot and a PC, targeting low-resource robotic platforms.

Findings

01. Enhanced speech recognition accuracy in resource-constrained environments

02. Real-time processing demonstrated on multiple robotic platforms

03. System adapts to changing acoustic conditions

Abstract

This paper presents a novel hybrid Automatic Speech Recognition (ASR) system designed specifically for resource-constrained robots. The proposed approach combines Hidden Markov Models (HMMs) with deep learning models and leverages socket programming to distribute processing tasks effectively. In this architecture, the HMM-based processing takes place within the robot, while a separate PC handles the deep learning model. This synergy between HMMs and deep learning enhances speech recognition accuracy significantly. We conducted experiments across various robotic platforms, demonstrating real-time and precise speech recognition capabilities. Notably, the system exhibits adaptability to changing acoustic conditions and compatibility with low-power hardware, making it highly effective in environments with limited computational resources. This hybrid ASR paradigm opens up promising…
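The robot/PC split described in the abstract can be illustrated with a minimal sketch. The abstract does not say what is exchanged over the socket, so everything here is an assumption made for illustration: the robot is taken to send N-best hypotheses with HMM log-probabilities, the PC is taken to rescore them, and the message framing (4-byte length prefix plus JSON), the names `pc_server` and `robot_recognise`, and the `placeholder_neural_score` function are all hypothetical. The placeholder is a lookup table standing in for a deep model, not the authors' network.

```python
import json
import socket
import struct
import threading


def send_msg(sock, obj):
    """Send one JSON message preceded by a 4-byte big-endian length."""
    data = json.dumps(obj).encode("utf-8")
    sock.sendall(struct.pack(">I", len(data)) + data)


def recv_exact(sock, n):
    """Read exactly n bytes, since recv() may return fewer."""
    buf = b""
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("socket closed mid-message")
        buf += chunk
    return buf


def recv_msg(sock):
    """Receive one length-prefixed JSON message."""
    (length,) = struct.unpack(">I", recv_exact(sock, 4))
    return json.loads(recv_exact(sock, length).decode("utf-8"))


# Hypothetical stand-in for the PC-side deep model: known robot
# commands get no penalty, anything else is penalised.
KNOWN_COMMANDS = {"move forward", "turn left", "stop"}


def placeholder_neural_score(text):
    return 0.0 if text in KNOWN_COMMANDS else -5.0


def pc_server(listener):
    """PC side: accept one request and return the best rescored hypothesis."""
    conn, _ = listener.accept()
    with conn:
        hyps = recv_msg(conn)["hypotheses"]
        best = max(
            hyps,
            key=lambda h: h["hmm_logp"] + placeholder_neural_score(h["text"]),
        )
        send_msg(conn, {"text": best["text"]})


def robot_recognise(address, hypotheses):
    """Robot side: ship HMM hypotheses to the PC and wait for the answer."""
    with socket.create_connection(address, timeout=5) as sock:
        send_msg(sock, {"hypotheses": hypotheses})
        return recv_msg(sock)["text"]


def demo():
    # Bind to an ephemeral loopback port so the sketch runs on one machine.
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)
    worker = threading.Thread(target=pc_server, args=(listener,), daemon=True)
    worker.start()
    # Made-up scores: the HMM alone slightly prefers the wrong split.
    hyps = [
        {"text": "move for word", "hmm_logp": -3.9},
        {"text": "move forward", "hmm_logp": -4.0},
    ]
    result = robot_recognise(listener.getsockname(), hyps)
    worker.join(timeout=5)
    listener.close()
    return result


if __name__ == "__main__":
    print(demo())
```

In this sketch the robot only does lightweight work and one network round trip per utterance, while the heavier scoring runs on the PC. The length prefix is there because TCP is a byte stream and does not preserve message boundaries.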

Code & Models

Repositories

anshulranjan2004/pyhmm (Official)

Taxonomy

Topics: Speech Recognition and Synthesis · Speech and Audio Processing · Speech and dialogue systems