A Comparative Study of LLM Prompting and Fine-Tuning for Cross-genre Authorship Attribution on Chinese Lyrics

Yuxin Li; Lorraine Xu; Meng Fan Wang

arXiv:2511.21930·cs.CL·December 1, 2025

Open Access

TL;DR

This study compares prompting and fine-tuning of large language models for Chinese lyric authorship attribution, revealing genre-dependent performance differences and establishing a new benchmark dataset for the domain.

Contribution

The paper introduces a new balanced Chinese lyric dataset, compares fine-tuning with zero-shot prompting, and highlights how genre affects attribution accuracy.

Findings

01 Structured genres improve attribution accuracy.

02 Fine-tuning enhances robustness in real-world data.

03 Genre sensitivity significantly impacts model performance.

Abstract

We propose a novel study on authorship attribution for Chinese lyrics, a domain where clean, public datasets are sorely lacking. Our contributions are twofold: (1) we create a new, balanced dataset of Chinese lyrics spanning multiple genres, and (2) we develop and fine-tune a domain-specific model, comparing its performance against zero-shot inference using the DeepSeek LLM. We test two central hypotheses. First, we hypothesize that a fine-tuned model will outperform a zero-shot LLM baseline. Second, we hypothesize that performance is genre-dependent. Our experiments strongly confirm Hypothesis 2: structured genres (e.g. Folklore & Tradition) yield significantly higher attribution accuracy than more abstract genres (e.g. Love & Romance). Hypothesis 1 receives only partial support: fine-tuning improves robustness and generalization in Test1 (real-world data and difficult genres), but…
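The genre-dependence hypothesis in the abstract comes down to comparing attribution accuracy within each genre. The following is a minimal sketch of that bookkeeping, not the authors' evaluation code. The function name `accuracy_by_genre`, the record layout, and the author labels are all hypothetical, and the toy records are made up purely to show the call shape; they are not results from the paper.

```python
from collections import defaultdict


def accuracy_by_genre(records):
    """Return {genre: accuracy} for (genre, true_author, predicted_author) records.

    Hypothetical helper for illustration; the paper's own evaluation
    pipeline is not described in this listing.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for genre, true_author, predicted_author in records:
        total[genre] += 1
        if true_author == predicted_author:
            correct[genre] += 1
    # Only genres that appear in the records are reported, so total[genre] > 0.
    return {genre: correct[genre] / total[genre] for genre in total}


# Made-up records (placeholder authors), NOT data or results from the paper.
toy_records = [
    ("Folklore & Tradition", "author_a", "author_a"),
    ("Folklore & Tradition", "author_b", "author_b"),
    ("Love & Romance", "author_a", "author_b"),
    ("Love & Romance", "author_b", "author_b"),
]
toy_scores = accuracy_by_genre(toy_records)
```

Running the same function once on the fine-tuned model's predictions and once on the zero-shot predictions would give two per-genre tables that can be compared side by side, which is one simple way to examine both hypotheses at the same time.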

Taxonomy

Topics: Authorship Attribution and Profiling · Topic Modeling · Hate Speech and Cyberbullying Detection