Detecting Subtle Differences between Human and Model Languages Using Spectrum of Relative Likelihood

Yang Xu; Yu Wang; Hao An; Zhichen Liu; Yongyuan Li

arXiv:2406.19874 · cs.CL · October 10, 2024

TL;DR

This paper introduces an approach that uses the spectrum of relative likelihoods to detect subtle differences between human and AI-generated texts, achieving state-of-the-art results on short-text detection and providing insights rooted in psycholinguistics.

Contribution

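The paper presents a new detection method based on spectrum analysis of relative likelihood. The method performs competitively with previous zero-shot techniques, sets a new state of the art on short texts, and reveals subtle language differences that have theoretical backing in psycholinguistics.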

Findings

1. Achieves state-of-the-art short-text detection performance
2. Provides a new perspective using relative likelihood spectrum analysis
3. Reveals subtle linguistic differences rooted in psycholinguistics

Abstract

Human and model-generated texts can be distinguished by examining the magnitude of likelihood in language. However, this is becoming increasingly difficult as language models' capabilities of generating human-like text keep evolving. This study provides a new perspective by using relative likelihood values instead of absolute ones, and by extracting useful features from the spectrum view of likelihood for the human-model text detection task. We propose a detection procedure with two classification methods, one supervised and one heuristic-based, which achieves performance competitive with previous zero-shot detection methods and a new state of the art on short-text detection. Our method can also reveal subtle differences between human and model languages, which find theoretical roots in psycholinguistic studies. Our code is available at…
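
The abstract does not spell out how relative likelihood or its spectrum is computed, so the following is only a minimal sketch under stated assumptions: per-token log-likelihoods are made "relative" by z-score normalization within one text, and the "spectrum view" is taken to be the magnitude of a discrete Fourier transform of that sequence. The function names and the sample log-probabilities are hypothetical and are not taken from the paper or its repository.

```python
import cmath
import math
from statistics import mean, pstdev


def relative_likelihood(log_probs):
    """Z-score the per-token log-likelihoods of one text.

    Assumption: "relative" means normalized within the text, so the
    absolute magnitude of likelihood no longer matters.
    """
    mu = mean(log_probs)
    sigma = pstdev(log_probs)
    if sigma == 0:
        # A constant sequence carries no relative information.
        return [0.0 for _ in log_probs]
    return [(x - mu) / sigma for x in log_probs]


def magnitude_spectrum(values):
    """Magnitudes of the discrete Fourier transform for frequencies 0..N//2."""
    n = len(values)
    spectrum = []
    for k in range(n // 2 + 1):
        coeff = sum(
            v * cmath.exp(-2j * math.pi * k * t / n)
            for t, v in enumerate(values)
        )
        spectrum.append(abs(coeff) / n)
    return spectrum


# Hypothetical per-token log-probabilities, standing in for the output
# of whatever language model is used to score the text.
log_probs = [-2.1, -0.4, -3.3, -0.9, -1.7, -0.2, -4.0, -1.1]
rel = relative_likelihood(log_probs)
spec = magnitude_spectrum(rel)
```

Under these assumptions, the zero-frequency component of `spec` vanishes because the z-scored sequence has zero mean, so only the shape of the likelihood fluctuations remains. The remaining components could then serve as features for a supervised or heuristic classifier of the kind the abstract mentions.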

Code & Models

Repositories

clcs-sustech/fouriergpt (PyTorch, official)

Videos

Detecting Subtle Differences between Human and Model Languages Using Spectrum of Relative Likelihood · Underline

Taxonomy

Topics: Fractal and DNA sequence analysis