VietLyrics: A Large-Scale Dataset and Models for Vietnamese Automatic Lyrics Transcription

Quoc Anh Nguyen; Bernard Cheng; Kelvin Soh

arXiv:2510.22295 · cs.AI · December 23, 2025

TL;DR

This paper introduces VietLyrics, the first large-scale Vietnamese lyrics dataset, and demonstrates how fine-tuning Whisper models on this data improves automatic lyrics transcription for Vietnamese music.

Contribution

The paper creates the first large-scale Vietnamese ALT dataset and shows that Whisper models fine-tuned on it outperform existing multilingual ALT systems.

Findings

01

Whisper models fine-tuned on VietLyrics transcribe Vietnamese lyrics more accurately than existing multilingual ALT systems, including LyricWhiz.

02

Current ASR-based approaches make frequent transcription errors and hallucinate text in non-vocal segments.

03

The VietLyrics dataset enables ALT research for a low-resource language.

Abstract

Automatic Lyrics Transcription (ALT) for Vietnamese music presents unique challenges due to its tonal complexity and dialectal variations, but remains largely unexplored due to the lack of a dedicated dataset. Therefore, we curated the first large-scale Vietnamese ALT dataset (VietLyrics), comprising 647 hours of songs with line-level aligned lyrics and metadata to address these issues. Our evaluation of current ASR-based approaches reveals significant limitations, including frequent transcription errors and hallucinations in non-vocal segments. To improve performance, we fine-tuned Whisper models on the VietLyrics dataset, achieving superior results compared to existing multilingual ALT systems, including LyricWhiz. We publicly release VietLyrics and our models, aiming to advance Vietnamese music computing research while demonstrating the potential of this approach for ALT in…
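
The abstract compares systems by transcription quality but does not name the metric. A common choice for transcription tasks is word error rate (WER), and the sketch below assumes it purely for illustration; the function name and the whitespace tokenization are hypothetical, not taken from the paper. It also shows why hallucinated text is costly: every extra word counts as an insertion, so WER can exceed 1.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are separated by whitespace, which is a simplification
    for illustration and not the paper's stated evaluation protocol.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # A dropped tone mark changes the word, so it counts as a substitution.
    print(word_error_rate("em ơi", "em oi"))
    # Extra words in the hypothesis are counted as insertions.
    print(word_error_rate("a b", "a b c d e"))
```

Under this definition a missing or wrong diacritic makes the whole word incorrect, which is one reason tonal languages are sensitive to small acoustic errors.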
