Towards Robust Speech Recognition for Jamaican Patois Music Transcription

Jordan Madden; Matthew Stone; Dimitri Johnson; Daniel Geddez

arXiv:2507.16834·eess.AS·July 24, 2025

TL;DR

This paper improves Jamaican Patois music transcription by creating a new dataset and fine-tuning ASR models, enhancing accessibility and advancing language modeling for this underrepresented language.

Contribution

The work introduces a curated dataset of more than 40 hours of manually transcribed Patois music and fine-tunes state-of-the-art ASR models on it to improve transcription accuracy.

Findings

01. Fine-tuned Whisper models show improved performance on Patois music

02. Scaling laws for Whisper models on Patois audio are developed

03. Enhanced accessibility for Jamaican Patois music through improved transcription

Abstract

Although Jamaican Patois is a widely spoken language, current speech recognition systems perform poorly on Patois music, producing inaccurate captions that limit accessibility and hinder downstream applications. In this work, we take a data-centric approach to this problem by curating more than 40 hours of manually transcribed Patois music. We use this dataset to fine-tune state-of-the-art automatic speech recognition (ASR) models, and use the results to develop scaling laws for the performance of Whisper models on Jamaican Patois audio. We hope that this work will have a positive impact on the accessibility of Jamaican Patois music and the future of Jamaican Patois language modeling.
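
The abstract does not state the functional form of its scaling laws. As a rough illustration only, the sketch below assumes a simple power law, error ≈ a · N^(-b), relating an error metric to model size N, and fits it by ordinary least squares in log-log space. The function name `fit_power_law` and the error values are hypothetical; the error values are synthetic placeholders generated from a known curve, not results from the paper. The parameter counts are the approximate published sizes of the Whisper tiny, base, small, medium, and large checkpoints.

```python
import math


def fit_power_law(sizes, errors):
    """Fit errors ~= a * sizes**(-b) by least squares on log-transformed data.

    Returns the pair (a, b). Both inputs must contain positive numbers.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(e) for e in errors]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Closed-form slope and intercept of the best-fit line y = slope * x + intercept.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # In log space: log(error) = log(a) - b * log(N), so a = exp(intercept), b = -slope.
    return math.exp(intercept), -slope


# Approximate parameter counts for Whisper tiny, base, small, medium, large.
whisper_params = [39e6, 74e6, 244e6, 769e6, 1550e6]

# SYNTHETIC placeholder error rates drawn from a known power law (a=5.0, b=0.15).
# These are NOT measurements from the paper.
synthetic_errors = [5.0 * n ** -0.15 for n in whisper_params]

a, b = fit_power_law(whisper_params, synthetic_errors)
print(f"fitted a={a:.3f}, b={b:.3f}")
```

With real measurements, the fitted exponent b would summarize how quickly the error falls as model size grows; the same fit can be repeated with hours of training data in place of N if data scaling is of interest.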

Taxonomy

Topics: Speech and Audio Processing · Music and Audio Processing · Phonetics and Phonology Research