EAGLE: A Domain Generalization Framework for AI-generated Text Detection
Amrita Bhattacharjee, Raha Moraffah, Joshua Garland, Huan Liu

TL;DR
EAGLE is a domain generalization framework that detects AI-generated text from unseen language models by learning domain-invariant features, which reduces the need to collect new labeled data for each newly released model.
Contribution
EAGLE introduces a novel domain generalization approach combining contrastive learning and adversarial training to detect AI-generated text from unseen models.
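To make the idea of combining contrastive learning with adversarial training concrete, here is a minimal sketch of one common way to write such an objective from the encoder's point of view. It is not taken from the paper: the function names, the loss weights `alpha` and `beta`, the temperature, and the specific choice of a supervised contrastive term plus a DANN-style domain-confusion term are all assumptions made for illustration. Here a "domain" stands for the source generator that produced a text.

```python
import math

def softmax_cross_entropy(logits, label):
    """Cross-entropy for one example, given raw logits and an integer label."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return log_z - logits[label]

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def supervised_contrastive_loss(features, labels, temperature=0.5):
    """Pull features with the same label (human vs. AI) together, push others apart.

    Hypothetical helper: a generic supervised contrastive loss, assumed here
    only as a stand-in for whatever contrastive term the framework uses.
    """
    total, count = 0.0, 0
    n = len(features)
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        sims = {j: cosine(features[i], features[j]) / temperature
                for j in range(n) if j != i}
        m = max(sims.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims.values()))
        total += -sum(sims[j] - log_denom for j in positives) / len(positives)
        count += 1
    return total / count if count else 0.0

def encoder_objective(features, label_logits, labels,
                      domain_logits, domains, alpha=1.0, beta=1.0):
    """Loss the feature encoder would minimize in this sketch.

    cls: human-vs-AI classification loss (minimized).
    con: contrastive loss on the features (minimized).
    dom: loss of a domain discriminator that guesses the source generator.
         The encoder *maximizes* it (hence the minus sign) so that features
         carry little generator-specific information; the discriminator
         itself would be trained separately to minimize dom.
    alpha and beta are assumed weights, not values from the paper.
    """
    n = len(labels)
    cls = sum(softmax_cross_entropy(l, y) for l, y in zip(label_logits, labels)) / n
    dom = sum(softmax_cross_entropy(l, d) for l, d in zip(domain_logits, domains)) / n
    con = supervised_contrastive_loss(features, labels)
    return cls + alpha * con - beta * dom
```

In this formulation, features that let the discriminator identify the source generator raise the encoder's loss, which is one standard route to the kind of generator-invariant features the contribution describes.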
Findings
EAGLE achieves detection scores within 4.7% of fully supervised detectors on new models.
EAGLE effectively detects text from recent models like GPT-4 and Claude.
The framework reduces the need for labeled data from new language models.
Abstract
With the advancement in capabilities of Large Language Models (LLMs), one major step in the responsible and safe use of such LLMs is to be able to detect text generated by these models. While supervised AI-generated text detectors perform well on text generated by older LLMs, with the frequent release of new LLMs, building supervised detectors for identifying text from such new models would require new labeled training data, which is infeasible in practice. In this work, we tackle this problem and propose a domain generalization framework for the detection of AI-generated text from unseen target generators. Our proposed framework, EAGLE, leverages the labeled data that is available so far from older language models and learns features invariant across these generators, in order to detect text generated by an unknown target generator. EAGLE learns such domain-invariant features by…
Taxonomy
Topics: Natural Language Processing Techniques · Topic Modeling
Methods: Attention Is All You Need · Linear Layer · Layer Normalization · Byte Pair Encoding · Absolute Position Encodings · Position-Wise Feed-Forward Layer · Residual Connection · Multi-Head Attention · Softmax · Dropout
