LLM Detectors Still Fall Short of Real World: Case of LLM-Generated Short News-Like Posts

Henrique Da Silva Gameiro; Andrei Kucharavy; Ljiljana Dolamic

arXiv:2409.03291·cs.CL·September 30, 2024

TL;DR

Existing LLM detectors are ineffective in a real-world scenario involving short news-like posts: they are vulnerable to simple attacks and generalize poorly, which highlights the need for improved, domain-specific benchmarking methods.

Contribution

This paper demonstrates the limitations of current LLM detectors in real-world settings and proposes a new, extensible benchmark for evaluating their robustness and generalization.

Findings

01

Zero-shot detectors perform inconsistently with prior benchmarks and are highly vulnerable to an increase in sampling temperature.

02

Purpose-trained detectors struggle to generalize to new human-written texts.

03

Benchmarking approaches need re-evaluation to better reflect real-world challenges.

Abstract

With the emergence of widely available, powerful large language models (LLMs), LLM-generated disinformation has become a major concern. Historically, LLM detectors have been touted as a solution, but their effectiveness in the real world is still to be proven. In this paper, we focus on an important setting in information operations -- short news-like posts generated by moderately sophisticated attackers. We demonstrate that existing LLM detectors, whether zero-shot or purpose-trained, are not ready for real-world use in that setting. All tested zero-shot detectors perform inconsistently with prior benchmarks and are highly vulnerable to an increase in sampling temperature, a trivial attack absent from recent benchmarks. A purpose-trained detector generalizing across LLMs and unseen attacks can be developed, but it fails to generalize to new human-written texts. We argue that the…
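To make the sampling-temperature attack concrete, here is a minimal toy sketch, not the paper's experimental setup. It assumes a detector that scores text by its likelihood under a reference model at temperature 1, which is a common zero-shot design; the specific detectors tested are not listed in this summary. The logits and the function names `softmax_with_temperature` and `expected_log_likelihood` are hypothetical and chosen for illustration only.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn raw logits into sampling probabilities at the given temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def expected_log_likelihood(logits, temperature):
    """Expected log-probability, scored at temperature 1, of a token sampled at `temperature`."""
    scoring = softmax_with_temperature(logits, 1.0)           # what the assumed detector sees
    sampling = softmax_with_temperature(logits, temperature)  # what the attacker samples from
    return sum(q * math.log(p) for q, p in zip(sampling, scoring))

# Hypothetical next-token logits over a five-token vocabulary (assumption, not real model output).
toy_logits = [4.0, 2.0, 1.0, 0.5, 0.0]

for t in (0.7, 1.0, 1.3):
    score = expected_log_likelihood(toy_logits, t)
    print(f"T={t}: expected log-likelihood = {score:.3f}")
```

In this toy setting, raising the temperature flattens the sampling distribution, so sampled tokens receive a lower average log-likelihood from the scoring model. A detector that flags text for being unusually likely would then see scores drift toward the range it associates with human writing.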

Taxonomy

Topics: Scientific Computing and Data Management · Mathematics, Computing, and Information Processing · Research Data Management Practices