Timed text extraction from Taiwanese Kua-á-hì TV series

Tzu-Hung Huang; Yun-En Tsai; Yun-Ning Hung; Chih-Wei Wu; I-Chieh Wei; Li Su

arXiv:2601.00299·cs.SD·January 5, 2026

Open Access

TL;DR

This paper presents an interactive system combining OCR correction and speech/music detection to efficiently extract vocal segments and lyrics from Taiwanese opera TV series, facilitating music information retrieval tasks.

Contribution

It introduces a novel two-step approach integrating OCR and SMAD for high-precision vocal segment identification in low-quality archival videos.

Findings

01. High-precision vocal segment detection achieved

02. Efficient extraction of lyrics and vocal segments

03. Supports MIR tasks like lyrics identification

Abstract

Taiwanese opera (Kua-á-hì), a major form of local theatrical tradition, underwent extensive television adaptation, notably by pioneers like Iûnn Lē-hua. These videos, while potentially valuable for in-depth studies of Taiwanese opera, are often of low quality and require substantial manual effort during data preparation. To streamline this process, we developed an interactive system for real-time OCR correction and a two-step approach that integrates OCR-driven segmentation with Speech and Music Activity Detection (SMAD) to efficiently identify vocal segments from archival episodes with high precision. The resulting dataset, consisting of vocal segments and corresponding lyrics, can potentially support various MIR tasks such as lyrics identification and tune retrieval. Code is available at https://github.com/z-huang/ocr-subtitle-editor.
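
As a rough illustration of how OCR-driven segmentation and SMAD output might be combined, the sketch below keeps only those subtitle segments that overlap sufficiently with detected music activity. It is not taken from the authors' repository: the function names, the interval representation, and the `min_music_ratio` threshold are all assumptions made for this example, and the music intervals are assumed to be non-overlapping.

```python
from typing import List, Tuple

Interval = Tuple[float, float]  # (start_sec, end_sec)


def overlap(a: Interval, b: Interval) -> float:
    """Length in seconds of the intersection of two intervals (0 if disjoint)."""
    return max(0.0, min(a[1], b[1]) - max(a[0], b[0]))


def filter_vocal_segments(
    subtitle_segments: List[Tuple[str, Interval]],
    music_segments: List[Interval],
    min_music_ratio: float = 0.5,  # hypothetical threshold, not from the paper
) -> List[Tuple[str, Interval]]:
    """Keep subtitle segments whose time span is mostly covered by music.

    subtitle_segments: (text, interval) pairs, e.g. from OCR on subtitles.
    music_segments: non-overlapping intervals flagged as music by an SMAD model.
    """
    kept = []
    for text, span in subtitle_segments:
        duration = span[1] - span[0]
        if duration <= 0:
            continue  # skip degenerate segments
        music_time = sum(overlap(span, m) for m in music_segments)
        if music_time / duration >= min_music_ratio:
            kept.append((text, span))
    return kept


# Placeholder data for illustration only
subtitles = [
    ("lyric line A", (0.0, 4.0)),
    ("spoken dialogue", (5.0, 8.0)),
    ("lyric line B", (10.0, 12.0)),
]
music = [(0.0, 4.5), (9.5, 13.0)]
vocal = filter_vocal_segments(subtitles, music)
```

In this toy input, the two lyric lines fall entirely inside music intervals and are kept, while the dialogue segment has no music overlap and is dropped. A real pipeline would likely need to handle OCR timing jitter and background music under dialogue, which this sketch ignores.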

Taxonomy

Topics: Music and Audio Processing · Authorship Attribution and Profiling · Theater, Performance, and Music History