Sarang at DEFACTIFY 4.0: Detecting AI-Generated Text Using Noised Data and an Ensemble of DeBERTa Models

Avinash Trivedi; Sangeetha Sivanesan

arXiv:2502.16857·cs.CL·February 25, 2025

Open Access

TL;DR

This paper trains an ensemble of DeBERTa models on noise-augmented data to detect AI-generated text, placing first in both subtasks of the Defactify 4.0 shared task.

Contribution

The paper presents a novel noise-injection technique combined with an ensemble of DeBERTa models for improved AI text detection, winning first place in the shared task.

Findings

01

Achieved F1 scores of 1.0 on Task-A (AI-generated vs. human-written) and 0.9531 on Task-B (identifying the generating model)

02

Demonstrated robustness through noise-augmented training

03

Ranked 1st in both subtasks of the Defactify 4.0 shared task

Abstract

This paper presents an effective approach to detecting AI-generated text, developed for the Defactify 4.0 shared task at the fourth workshop on multimodal fact checking and hate speech detection. The task consists of two subtasks: Task-A, classifying whether a text is AI-generated or human-written, and Task-B, identifying the specific large language model that generated the text. Our team (Sarang) achieved 1st place in both tasks, with F1 scores of 1.0 and 0.9531, respectively. The methodology involves adding noise to the dataset to improve model robustness and generalization. We used an ensemble of DeBERTa models to effectively capture complex patterns in the text. The results indicate the effectiveness of our noise-driven and ensemble-based approach, setting a new standard in AI-generated text detection and providing guidance for future developments.
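The abstract names two ingredients, noising the training data and ensembling several models, without giving their details. The sketch below illustrates the general idea only and is not the authors' implementation. The noise scheme (shuffling the characters of a random fraction of words), the `noise_rate` value, and the soft-voting rule are assumptions made for illustration, and the names `add_word_noise` and `soft_vote` are hypothetical. The per-model probabilities would in practice come from fine-tuned DeBERTa classifiers; here they are plain lists so the example stays self-contained.

```python
import random


def add_word_noise(text, noise_rate=0.1, seed=None):
    """Return a copy of `text` with roughly `noise_rate` of its words scrambled.

    Assumed noise scheme for illustration: each selected word has its
    characters shuffled. The paper may use a different kind of noise.
    """
    rng = random.Random(seed)
    noisy_words = []
    for word in text.split():
        if len(word) > 1 and rng.random() < noise_rate:
            chars = list(word)
            rng.shuffle(chars)
            noisy_words.append("".join(chars))
        else:
            noisy_words.append(word)
    return " ".join(noisy_words)


def soft_vote(prob_lists):
    """Average class probabilities from several models and pick the argmax.

    `prob_lists` holds one probability vector per model, all of equal length.
    Returns (predicted_class_index, averaged_probabilities).
    """
    n_models = len(prob_lists)
    n_classes = len(prob_lists[0])
    avg = [sum(p[c] for p in prob_lists) / n_models for c in range(n_classes)]
    label = max(range(n_classes), key=avg.__getitem__)
    return label, avg


if __name__ == "__main__":
    sample = "The quick brown fox jumps over the lazy dog"
    print(add_word_noise(sample, noise_rate=0.3, seed=0))

    # Hypothetical outputs of three classifiers for one text: [P(human), P(AI)]
    label, avg = soft_vote([[0.9, 0.1], [0.6, 0.4], [0.3, 0.7]])
    print(label, avg)
```

Training on noised copies of the data is meant to keep a classifier from relying on brittle surface cues, and averaging probabilities across models smooths out the errors of any single model. Both are common choices, shown here under the assumptions stated above.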

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods

Methods: DeBERTa