Detecting the Machine: A Comprehensive Benchmark of AI-Generated Text Detectors Across Architectures, Domains, and Adversarial Conditions

Madhav S. Baidya; S. S. Baidya; Chirag Chawla

arXiv:2603.17522·cs.CL·March 19, 2026

Detecting the Machine: A Comprehensive Benchmark of AI-Generated Text Detectors Across Architectures, Domains, and Adversarial Conditions

Madhav S. Baidya, S. S. Baidya, Chirag Chawla

PDF

Open Access

TL;DR

This paper benchmarks various AI-generated text detectors across multiple models, domains, and adversarial conditions, revealing significant limitations in generalization and robustness of current detection methods.

Contribution

It provides a comprehensive evaluation of diverse detection approaches across multiple datasets and conditions, highlighting their strengths and weaknesses.

Findings

01

Transformer models perform well in-distribution but poorly under domain shift.

02

XGBoost stylometric model is interpretable and matches performance.

03

LLM-based detectors underperform and are affected by bias.

Abstract

The rapid proliferation of large language models (LLMs) has created an urgent need for robust and generalizable detectors of machine-generated text. Existing benchmarks typically evaluate a single detector on a single dataset under ideal conditions, leaving open questions about cross-domain transfer, cross-LLM generalization, and adversarial robustness. We present a comprehensive benchmark evaluating diverse detection approaches across two corpora: HC3 (23,363 human-ChatGPT pairs) and ELI5 (15,000 human-Mistral-7B pairs). Methods include classical classifiers, fine-tuned transformer encoders (BERT, RoBERTa, ELECTRA, DistilBERT, DeBERTa-v3), a CNN, an XGBoost stylometric model, perplexity-based detectors, and LLM-as-detector prompting. Results show that transformer models achieve near-perfect in-distribution performance but degrade under domain shift. The XGBoost stylometric model…

Peer Reviews

No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.

Videos

No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.

Taxonomy

TopicsAuthorship Attribution and Profiling · Hate Speech and Cyberbullying Detection · Topic Modeling