LIWhiz: A Non-Intrusive Lyric Intelligibility Prediction System for the Cadenza Challenge
Ram C. M. C. Shekar, Iván López-Espejo

TL;DR
LIWhiz is a non-intrusive system that predicts lyric intelligibility from Whisper features with a trainable back-end, outperforming the STOI-based baseline in the Cadenza Challenge.
Contribution
It introduces LIWhiz, a non-intrusive lyric intelligibility prediction system that pairs Whisper feature extraction with a trainable back-end and reduces RMSE by 22.4% relative to the STOI-based baseline.
Findings
Achieves an RMSE of 27.07% on the CLIP evaluation set, a 22.4% relative reduction over the STOI-based baseline.
Substantially improves normalized cross-correlation over the baseline.
Demonstrates effectiveness on the Cadenza Lyric Intelligibility Prediction dataset.
Abstract
We present LIWhiz, a non-intrusive lyric intelligibility prediction system submitted to the ICASSP 2026 Cadenza Challenge. LIWhiz leverages Whisper for robust feature extraction and a trainable back-end for score prediction. Tested on the Cadenza Lyric Intelligibility Prediction (CLIP) evaluation set, LIWhiz achieves a root mean square error (RMSE) of 27.07%, a 22.4% relative RMSE reduction over the STOI-based baseline, yielding a substantial improvement in normalized cross-correlation.
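The metrics named in the abstract can be illustrated with a minimal sketch. It is not the challenge's official scoring code. It assumes intelligibility scores are percentages in [0, 100] and that normalized cross-correlation is computed as the Pearson correlation coefficient. All function names are hypothetical. The baseline RMSE of roughly 34.88% is only implied by the two reported figures (27.07% and 22.4%) and is not stated in the source.

```python
import math


def rmse(pred, target):
    """Root mean square error between predicted and reference scores."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    return math.sqrt(sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred))


def ncc(pred, target):
    """Normalized cross-correlation, assumed here to be Pearson's r."""
    if len(pred) != len(target) or not pred:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(pred)
    mean_p = sum(pred) / n
    mean_t = sum(target) / n
    num = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, target))
    den = math.sqrt(
        sum((p - mean_p) ** 2 for p in pred) * sum((t - mean_t) ** 2 for t in target)
    )
    # A constant signal has zero variance; return 0.0 instead of dividing by zero.
    return num / den if den > 0 else 0.0


def relative_reduction(baseline_rmse, system_rmse):
    """Fractional RMSE reduction of the system with respect to the baseline."""
    return (baseline_rmse - system_rmse) / baseline_rmse


def implied_baseline(system_rmse, rel_reduction):
    """Baseline RMSE implied by a system RMSE and a relative reduction."""
    return system_rmse / (1.0 - rel_reduction)


if __name__ == "__main__":
    # Back out the baseline RMSE implied by the reported figures (not reported itself).
    print(round(implied_baseline(27.07, 0.224), 2))
```

Under these assumptions, a lower RMSE and a higher normalized cross-correlation both indicate predictions closer to the listener-derived intelligibility scores.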
Taxonomy
Topics: Voice and Speech Disorders · Speech Recognition and Synthesis · Music and Audio Processing
