Feature-Augmented Transformers for Robust AI-Text Detection Across Domains and Generators

Mohamed Mady; Johannes Reschke; Björn Schuller

arXiv:2605.03969·cs.CL·May 6, 2026

TL;DR

This paper introduces feature-augmented transformer detectors that significantly improve robustness in AI-text detection across different domains and generators, outperforming previous models especially under distribution shifts.

Contribution

The study proposes a novel feature augmentation method combined with a DeBERTa backbone to enhance transferability and robustness of AI-text detectors across diverse datasets and generation methods.

Findings

01. Base models achieve up to 99.5% in-domain balanced accuracy on HC3 PLUS.

02. Feature augmentation improves transfer performance, reaching 85.9% on M4.

03. The proposed method outperforms zero-shot baselines by up to 7.22 points.

Abstract

AI-generated text is now produced at scale across domains and heterogeneous generation pipelines, making robustness to distribution shift a central requirement for supervised binary detectors. We train transformer-based detectors on HC3 PLUS and calibrate a single decision threshold by maximising balanced accuracy on held-out validation data; this threshold is then kept fixed for all downstream test distributions, revealing domain- and generator-dependent error asymmetries under shift. We evaluate in-domain on HC3 PLUS, under cross-dataset transfer to the multi-domain, multi-generator M4 benchmark, and on the external AI-Text-Detection-Pile. Although base models achieve near-ceiling in-domain performance (up to 99.5% balanced accuracy), performance under shift is brittle and strongly model-dependent. Feature augmentation via attention-based linguistic feature fusion improves transfer,…
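The threshold calibration described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function names `balanced_accuracy` and `calibrate_threshold`, the use of detector scores in [0, 1], the convention that label 1 means AI-generated, and the choice of the observed validation scores as candidate thresholds are all assumptions made for illustration.

```python
import numpy as np


def balanced_accuracy(y_true, y_pred):
    """Mean of true-positive rate and true-negative rate for binary labels."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    # Guard against an empty class so the division is always defined.
    tpr = (y_pred & y_true).sum() / max(y_true.sum(), 1)
    tnr = (~y_pred & ~y_true).sum() / max((~y_true).sum(), 1)
    return 0.5 * (tpr + tnr)


def calibrate_threshold(val_scores, val_labels):
    """Pick the score threshold that maximises balanced accuracy on validation data.

    Assumption: a text is predicted AI-generated when score >= threshold,
    and every distinct validation score is tried as a candidate threshold.
    """
    val_scores = np.asarray(val_scores, dtype=float)
    best_t, best_ba = 0.5, -1.0
    for t in np.unique(val_scores):
        ba = balanced_accuracy(val_labels, val_scores >= t)
        if ba > best_ba:  # strict '>' keeps the lowest threshold on ties
            best_t, best_ba = float(t), float(ba)
    return best_t, best_ba


def evaluate_fixed(threshold, scores, labels):
    """Apply an already-calibrated threshold to a new distribution without re-tuning."""
    preds = np.asarray(scores, dtype=float) >= threshold
    return float(balanced_accuracy(labels, preds))
```

Calling `calibrate_threshold` once on the held-out validation split and then passing the returned threshold to `evaluate_fixed` for each downstream test set mirrors the fixed-threshold protocol: any drop in balanced accuracy on a shifted set then reflects the detector's score distribution moving, not a re-tuned operating point.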
