Automatic Speech Recognition for Sanskrit with Transfer Learning

Bidit Sadhukhan; Swami Punyeshwarananda

arXiv:2501.10024·cs.CL·January 20, 2025

TL;DR

This paper presents a Sanskrit speech recognition system using transfer learning on OpenAI's Whisper, achieving promising accuracy and enhancing digital accessibility for this ancient language.

Contribution

The paper applies transfer learning to OpenAI's Whisper model for Sanskrit ASR, tuning hyper-parameters to improve performance on limited training data.

Findings

01

Achieved a word error rate (WER) of 15.42% on the Vaksancayah dataset.

02

Developed an accessible online demo for Sanskrit speech recognition.

03

Enhanced technological support for Sanskrit language learning.

Abstract

Sanskrit, one of humanity's most ancient languages, has a vast collection of books and manuscripts on diverse topics accumulated over millennia. However, its digital content (audio and text), which is vital for training AI systems, is severely limited. Furthermore, its intricate linguistics makes it hard to develop robust NLP tools for wider accessibility. Given these constraints, we have developed an automatic speech recognition model for Sanskrit by applying transfer learning to OpenAI's Whisper model. After carefully optimising the hyper-parameters, we obtained promising results, with our transfer-learned model achieving a word error rate of 15.42% on the Vaksancayah dataset. An online demo of our model is available for the public to use and to evaluate its performance firsthand, thereby paving the way for improved accessibility and technological…
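
For readers unfamiliar with the metric reported above, word error rate is conventionally the word-level edit distance (substitutions, deletions, and insertions) between the reference transcript and the model output, divided by the number of reference words. The sketch below is a minimal illustration of that standard definition, not the authors' evaluation code; the function name `word_error_rate` and the sample sentences are assumptions chosen for illustration and are not taken from the Vaksancayah dataset.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Illustrative helper (hypothetical name); tokenizes on whitespace only,
    with no text normalization applied.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Made-up example: one of four reference words is transcribed incorrectly.
reference = "dharmakṣetre kurukṣetre samavetā yuyutsavaḥ"
hypothesis = "dharmakṣetre kurukṣetre samavetā yuyutsava"
print(word_error_rate(reference, hypothesis))  # 0.25
```

Under this definition, a WER of 15.42% corresponds to roughly 15 word-level errors per 100 reference words. Because insertions count as errors, the value can exceed 100% when the hypothesis is much longer than the reference.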
