LIWhiz: A Non-Intrusive Lyric Intelligibility Prediction System for the Cadenza Challenge

Ram C. M. C. Shekar; Iván López-Espejo

arXiv:2512.17937 · eess.AS · February 2, 2026

TL;DR

LIWhiz is a non-intrusive lyric intelligibility prediction system that pairs Whisper-based feature extraction with a trainable back-end and clearly outperforms the STOI-based baseline of the ICASSP 2026 Cadenza Challenge.

Contribution

The paper introduces LIWhiz, a non-intrusive lyric intelligibility prediction system that uses Whisper for feature extraction and a trainable back-end for score prediction, achieving a 22.4% relative RMSE reduction over the STOI-based baseline.
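
The front-end/back-end split described above can be illustrated with a minimal sketch. The text on this page does not describe the actual LIWhiz back-end, so everything here is an assumption made for illustration: the function name `predict_intelligibility`, the mean pooling over frames, the single linear layer, the sigmoid scaling to a 0-100 score, and the random array that stands in for Whisper encoder features.

```python
import numpy as np

# Hypothetical sketch: NOT the LIWhiz architecture, only the general pattern of
# "pretrained features in, small trainable head out".
rng = np.random.default_rng(0)


def predict_intelligibility(features, w, b):
    """Map frame-level features of shape (T, D) to a score in [0, 100].

    features: array standing in for encoder outputs (assumed shape).
    w, b: trainable parameters of an assumed linear head.
    """
    pooled = features.mean(axis=0)        # average over time frames -> (D,)
    logit = float(pooled @ w + b)         # scalar regression output
    return float(100.0 / (1.0 + np.exp(-logit)))  # squash to a percentage


# Placeholder input: 50 frames of 16-dimensional features (arbitrary sizes).
features = rng.standard_normal((50, 16))
w = 0.1 * rng.standard_normal(16)
b = 0.0
score = predict_intelligibility(features, w, b)
```

In a real system the parameters `w` and `b` would be fitted on labeled intelligibility scores, and the features would come from a pretrained Whisper model rather than a random generator.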

Findings

01

Achieves an RMSE of 27.07% on the CLIP evaluation set, a 22.4% relative reduction over the STOI-based baseline.

02

Substantially improves normalized cross-correlation over the STOI-based baseline.

03

Demonstrates these gains on the Cadenza Lyric Intelligibility Prediction (CLIP) evaluation set.

Abstract

We present LIWhiz, a non-intrusive lyric intelligibility prediction system submitted to the ICASSP 2026 Cadenza Challenge. LIWhiz leverages Whisper for robust feature extraction and a trainable back-end for score prediction. Tested on the Cadenza Lyric Intelligibility Prediction (CLIP) evaluation set, LIWhiz achieves a root mean square error (RMSE) of 27.07%, a 22.4% relative RMSE reduction over the STOI-based baseline, and also yields a substantial improvement in normalized cross-correlation.
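
The two metrics named in the abstract can be sketched as follows. This is a minimal illustration, not the challenge's official scoring code: the helper names are hypothetical, and normalized cross-correlation is assumed here to be the mean-removed form, which coincides with Pearson correlation (the exact definition is not given in this text). Taken at face value, the two reported figures imply a baseline RMSE of roughly 27.07 / (1 - 0.224), about 34.9%; that value is derived arithmetic, not a number stated in the abstract.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root mean square error between true and predicted scores (in %)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def ncc(y_true, y_pred):
    """Normalized cross-correlation, assumed here to be mean-removed (Pearson)."""
    a = np.asarray(y_true, dtype=float)
    b = np.asarray(y_pred, dtype=float)
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(np.dot(a, b) / denom) if denom > 0 else 0.0


def relative_reduction(baseline, system):
    """Relative reduction of `system` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - system) / baseline


def implied_baseline(system, reduction_pct):
    """Baseline value implied by a system value and a relative reduction."""
    return system / (1.0 - reduction_pct / 100.0)


# Baseline RMSE implied by the reported 27.07% and 22.4% (derived, not reported).
baseline_rmse = implied_baseline(27.07, 22.4)
```

Relative reduction is computed against the baseline value, so a 22.4% relative reduction is not the same as a drop of 22.4 percentage points.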

Taxonomy

Topics: Voice and Speech Disorders · Speech Recognition and Synthesis · Music and Audio Processing