K-pop Lyric Translation: Dataset, Analysis, and Neural-Modelling
Haven Kim, Jongmin Jung, Dasaem Jeong, and Juhan Nam

TL;DR
This paper introduces a new dataset of K-pop lyric translations, analyzes its unique features, and develops a neural model for singable lyric translation, addressing gaps in genre and language coverage.
Contribution
It provides the first publicly available K-pop lyric translation dataset and demonstrates its use in analyzing and modeling singable lyric translation.
Findings
K-pop lyric translation has characteristics that distinguish it from other, more extensively studied genres.
The dataset enables better understanding of genre-specific translation features.
A neural translation model trained on this dataset shows promising results.
Abstract
Lyric translation, a field studied for over a century, is now attracting computational linguistics researchers. We identified two limitations in previous studies. First, lyric translation studies have predominantly focused on Western genres and languages, with no previous study centering on K-pop despite its popularity. Second, the field of lyric translation suffers from a lack of publicly available datasets; to the best of our knowledge, no such dataset exists. To broaden the scope of genres and languages in lyric translation studies, we introduce a novel singable lyric translation dataset, approximately 89% of which consists of K-pop song lyrics. This dataset aligns Korean and English lyrics line-by-line and section-by-section. We leveraged this dataset to unveil unique characteristics of K-pop lyric translation, distinguishing it from other extensively studied genres, and to…
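To make the line-by-line and section-by-section alignment concrete, the following is a minimal sketch of one way such a record could be represented in Python. The class names, field names, section labels, and sample text are all hypothetical and are not taken from the released dataset, whose actual file format is not described here.

```python
from dataclasses import dataclass, field


@dataclass
class AlignedLine:
    korean: str   # original lyric line
    english: str  # singable translation of that same line


@dataclass
class Section:
    label: str  # e.g. "verse" or "chorus" (hypothetical label set)
    lines: list[AlignedLine] = field(default_factory=list)


@dataclass
class Song:
    title: str
    sections: list[Section] = field(default_factory=list)

    def line_pairs(self) -> list[tuple[str, str]]:
        """Flatten the song into line-level (korean, english) pairs."""
        return [(ln.korean, ln.english) for sec in self.sections for ln in sec.lines]

    def section_pairs(self) -> list[tuple[str, str, str]]:
        """Join each section's lines into one (label, korean, english) triple."""
        return [
            (
                sec.label,
                "\n".join(ln.korean for ln in sec.lines),
                "\n".join(ln.english for ln in sec.lines),
            )
            for sec in self.sections
        ]


# Placeholder text for illustration only, not drawn from the dataset.
song = Song(
    title="example",
    sections=[
        Section("verse", [AlignedLine("안녕", "Hello"), AlignedLine("고마워", "Thank you")]),
        Section("chorus", [AlignedLine("사랑해", "I love you")]),
    ],
)
```

Keeping lines nested inside sections means both granularities can be derived from one record: `song.line_pairs()` yields line-level training pairs, while `song.section_pairs()` yields section-level ones.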
Taxonomy
Topics: Asian Culture and Media Studies
