FAID: Fine-Grained AI-Generated Text Detection Using Multi-Task Auxiliary and Multi-Level Contrastive Learning

Minh Ngoc Ta; Dong Cao Van; Duc-Anh Hoang; Minh Le-Anh; Truong Nguyen; My Anh Tran Nguyen; Yuxia Wang; Preslav Nakov; Sang Dinh

arXiv:2505.14271 · cs.CL · February 10, 2026

TL;DR

FAID is a novel multi-task, multi-level contrastive learning framework designed to accurately detect and classify AI-generated, human-written, and hybrid texts across multiple languages and domains, including identifying specific LLM families.

Contribution

The paper introduces FAID, a fine-grained detection model that captures stylistic cues and generalizes well to unseen data, advancing AI-generated text detection beyond binary classification.

Findings

1. FAID outperforms baseline models in accuracy.

2. Enhanced generalization to unseen domains and LLMs.

3. Effective identification of LLM families as stylistic entities.

Abstract

The growing collaboration between humans and AI models in generative tasks has introduced new challenges in distinguishing between human-written, LLM-generated, and human-LLM collaborative texts. In this work, we collect a multilingual, multi-domain, multi-generator dataset FAIDSet. We further introduce a fine-grained detection framework FAID to classify text into these three categories, and also to identify the underlying LLM family of the generator. Unlike existing binary classifiers, FAID is built to capture both authorship and model-specific characteristics. Our method combines multi-level contrastive learning with multi-task auxiliary classification to learn subtle stylistic cues. By modeling LLM families as distinct stylistic entities, we incorporate an adaptation to address distributional shifts without retraining for unseen data. Our experimental results demonstrate that FAID…
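The combination of multi-level contrastive learning and auxiliary classification described in the abstract can be illustrated with a minimal sketch. The abstract does not give the loss formulation, so everything below is an assumption: a generic supervised contrastive loss applied at two label levels (authorship category and LLM family), added to a cross-entropy term on the authorship classes. The function names, the weights `w_author`, `w_family`, `w_aux`, the temperature, and the label encoding are hypothetical and are not taken from the FAID code.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def supcon_loss(embeddings, labels, temperature=0.1):
    """Generic supervised contrastive loss (an assumed stand-in, not FAID's exact loss).

    Samples sharing a label are pulled together; all others are pushed apart.
    Anchors without any positive in the batch are skipped.
    """
    n = len(embeddings)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue
        logits = {
            j: cosine(embeddings[i], embeddings[j]) / temperature
            for j in range(n)
            if j != i
        }
        # Numerically stable log-sum-exp over all other samples.
        peak = max(logits.values())
        log_denom = peak + math.log(sum(math.exp(x - peak) for x in logits.values()))
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


def cross_entropy(logits, target):
    """Cross-entropy of one logit vector against an integer class index."""
    peak = max(logits)
    log_z = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_z - logits[target]


def combined_loss(embeddings, authorship, family, class_logits,
                  w_author=1.0, w_family=1.0, w_aux=1.0):
    """Two contrastive levels plus an auxiliary classification term (hypothetical weighting)."""
    contrastive = (
        w_author * supcon_loss(embeddings, authorship)
        + w_family * supcon_loss(embeddings, family)
    )
    aux = sum(cross_entropy(l, t) for l, t in zip(class_logits, authorship)) / len(authorship)
    return contrastive + w_aux * aux


# Toy batch. Assumed encoding: 0 = human-written, 1 = LLM-generated.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
authorship = [0, 0, 1, 1]
family = ["human", "human", "family_a", "family_a"]  # placeholder family names
class_logits = [[2.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 2.0, 0.0]]
loss = combined_loss(embeddings, authorship, family, class_logits)
```

In this toy batch the embeddings already cluster by label, so both contrastive terms are close to zero and the total is dominated by the classification term. Shuffling the labels so that they no longer match the clusters raises the contrastive terms sharply, which is the signal that would push an encoder toward label-consistent representations.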

Code & Models

Repositories

ngocminhta/faid
PyTorch · Official

Datasets

ngocminhta/FAIDSet
dataset · 38 downloads

Videos

FAID: Fine-grained AI-generated Text Detection using Multi-task Auxiliary and Multi-level Contrastive Learning · Underline

Taxonomy

Topics: Handwritten Text Recognition Techniques · Text and Document Classification Technologies · Natural Language Processing Techniques

Methods: Contrastive Learning