GigaCheck: Detecting LLM-generated Content via Object-Centric Span Localization

Irina Tolstykh; Aleksandra Tsybina; Sergey Yakubson; Aleksandr Gordeev; Vladimir Dokholyan; Maksim Kuprashevich

arXiv:2410.23728·cs.CL·April 15, 2026

TL;DR

GigaCheck is a dual-strategy framework for AI-generated text detection: fine-tuned LLMs classify authorship at the document level, and a DETR-like detection model combined with linguistic encoders localizes generated spans by treating them as objects.

Contribution

The paper presents a novel approach that adapts visual object detection models for precise localization of AI-generated text spans, enhancing detection robustness and generalization.

Findings

01. High accuracy in authorship classification across multiple benchmarks.

02. Effective localization of generated text spans using DETR-like models.

03. Demonstrated robustness and generalization of the approach across tasks.

Abstract

With the increasing quality and spread of LLM assistants, the amount of generated content is growing rapidly. In many cases and tasks, such texts are already indistinguishable from those written by humans, and the quality of generation continues to increase. At the same time, detection methods are advancing more slowly than generation models, making it challenging to prevent misuse of generative AI technologies. We propose GigaCheck, a dual-strategy framework for AI-generated text detection. At the document level, we leverage the representation learning of fine-tuned LLMs to discern authorship with high data efficiency. At the span level, we introduce a novel structural adaptation that treats generated text segments as "objects." By integrating a DETR-like vision model with linguistic encoders, we achieve precise localization of AI intervals, effectively transferring the robustness of…
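To make the "segments as objects" idea concrete, here is a minimal sketch of how generated spans could be encoded as one-dimensional detection targets. It is an illustration under stated assumptions, not the paper's implementation: the normalized (center, width) encoding is borrowed by analogy from DETR's normalized box format, and the names `spans_to_targets`, `targets_to_spans`, and `interval_iou` are hypothetical.

```python
from typing import List, Tuple

# Assumption: a span is (start, end) in token offsets, with end exclusive.
Span = Tuple[int, int]
Target = Tuple[float, float]


def spans_to_targets(spans: List[Span], num_tokens: int) -> List[Target]:
    """Encode each span as a normalized (center, width) pair in [0, 1].

    This mirrors how DETR normalizes 2D boxes, reduced to one dimension.
    """
    targets = []
    for start, end in spans:
        if not 0 <= start < end <= num_tokens:
            raise ValueError(f"invalid span {(start, end)} for length {num_tokens}")
        center = (start + end) / 2 / num_tokens
        width = (end - start) / num_tokens
        targets.append((center, width))
    return targets


def targets_to_spans(targets: List[Target], num_tokens: int) -> List[Span]:
    """Decode normalized (center, width) pairs back to token offsets."""
    spans = []
    for center, width in targets:
        start = round((center - width / 2) * num_tokens)
        end = round((center + width / 2) * num_tokens)
        # Clamp to the document so noisy predictions stay in range.
        spans.append((max(0, start), min(num_tokens, end)))
    return spans


def interval_iou(a: Span, b: Span) -> float:
    """Intersection over union of two 1D intervals, the span analogue of box IoU."""
    intersection = max(0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - intersection
    return intersection / union if union > 0 else 0.0


if __name__ == "__main__":
    # A 100-token document in which tokens 10..29 are machine-generated.
    gold = [(10, 30)]
    encoded = spans_to_targets(gold, num_tokens=100)
    decoded = targets_to_spans(encoded, num_tokens=100)
    print(encoded, decoded, interval_iou(decoded[0], (20, 40)))
```

Under this encoding, a detection head would regress two numbers per predicted span, and an interval IoU like the one above could be used to match predictions to labeled spans and to score localization quality.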
