HLPD: Aligning LLMs to Human Language Preference for Machine-Revised Text Detection

Fangqi Dai; Xingjian Jiang; Zizhuang Deng

arXiv:2511.06942·cs.CL·December 3, 2025

Open Access

TL;DR

This paper introduces HLPD, a novel method that aligns language models to human writing styles to improve detection of machine-revised texts, especially against advanced LLM outputs and adversarial revisions.

Contribution

HLPD employs a reward-based alignment process to enhance the sensitivity of detection models to human-like writing, addressing limitations of previous methods in adversarial scenarios.
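As a rough illustration of what a reward-based preference objective can look like, the sketch below uses a generic DPO-style pairwise loss in which the human-written text is the preferred sample and its machine-revised counterpart is the rejected one. This is an assumption made for illustration, not the HLPO objective itself, which this summary does not spell out. The function names, the `beta` parameter, and the log-probability values are all hypothetical.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def preference_loss(
    policy_human: float,
    policy_machine: float,
    ref_human: float,
    ref_machine: float,
    beta: float = 0.1,
) -> float:
    """Generic DPO-style pairwise loss (illustrative stand-in, not HLPO).

    Each argument is a summed token log-probability of one text under
    either the model being tuned ("policy") or a frozen reference model.
    The human-written text is treated as the preferred sample.
    """
    # Implicit reward of each text: how much the policy has moved
    # away from the reference on that text.
    reward_human = policy_human - ref_human
    reward_machine = policy_machine - ref_machine
    margin = reward_human - reward_machine
    # Loss shrinks as the policy favours the human text more strongly.
    return -log_sigmoid(beta * margin)


# Hypothetical log-probabilities for one (human, machine-revised) pair.
loss = preference_loss(
    policy_human=-42.0,
    policy_machine=-50.0,
    ref_human=-45.0,
    ref_machine=-44.0,
)
print(f"pairwise preference loss: {loss:.4f}")
```

When the tuned model and the reference agree on both texts, the margin is zero and the loss equals log 2. The loss falls below that value as the tuned model assigns relatively more probability to the human-written text.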

Findings

01. HLPD improves AUROC by 15.11% over ImBD on GPT-revised texts.

02. HLPD surpasses Fast-DetectGPT by 45.56% in detecting GPT-revised texts.

03. HLPD achieves the highest average AUROC on advanced LLM-generated texts.
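For readers unfamiliar with the metric, AUROC can be read as the probability that a detector assigns a higher score to a randomly chosen machine-revised text than to a randomly chosen human-written one. The sketch below computes it directly from that definition. The function name and the scores are hypothetical and are not taken from the paper's experiments.

```python
from typing import Sequence


def auroc(human_scores: Sequence[float], machine_scores: Sequence[float]) -> float:
    """AUROC via pairwise comparison of detector scores.

    Higher scores are assumed to mean "more likely machine-revised".
    Ties between a human and a machine score count as half a win.
    """
    if not human_scores or not machine_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for m in machine_scores:
        for h in human_scores:
            if m > h:
                wins += 1.0
            elif m == h:
                wins += 0.5
    return wins / (len(machine_scores) * len(human_scores))


# Hypothetical detector scores for three human and three machine-revised texts.
human = [0.10, 0.40, 0.35]
machine = [0.80, 0.30, 0.90]
print(f"AUROC: {auroc(human, machine):.3f}")
```

A value of 0.5 corresponds to chance-level separation and 1.0 to perfect separation. The pairwise loop is quadratic in the number of texts, which is fine for a small illustration; a rank-based implementation would be preferable on large evaluation sets.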

Abstract

To prevent misinformation and social issues arising from trustworthy-looking content generated by LLMs, it is crucial to develop efficient and reliable methods for identifying the source of texts. Previous approaches have demonstrated exceptional performance in detecting texts fully generated by LLMs. However, these methods struggle when confronting more advanced LLM output or text with adversarial multi-task machine revision, especially in the black-box setting, where the generating model is unknown. To address this challenge, grounded in the hypothesis that human writing possesses distinctive stylistic patterns, we propose Human Language Preference Detection (HLPD). HLPD employs a reward-based alignment process, Human Language Preference Optimization (HLPO), to shift the scoring model's token distribution toward human-like writing, making the model more sensitive to human writing,…

Taxonomy

Topics: Misinformation and Its Impacts · Topic Modeling · Authorship Attribution and Profiling