Sarang at DEFACTIFY 4.0: Detecting AI-Generated Text Using Noised Data and an Ensemble of DeBERTa Models
Avinash Trivedi, Sangeetha Sivanesan

TL;DR
This paper introduces an ensemble of DeBERTa models trained on noise-augmented data for detecting AI-generated text. The system placed first in both subtasks of the Defactify 4.0 shared task.
Contribution
The paper combines noise injection into the training data with an ensemble of DeBERTa models for AI-generated text detection, winning first place in both subtasks of the shared task.
Findings
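Achieved an F1 score of 1.0 on Task-A (AI-generated vs. human-written classification) and 0.9531 on Task-B (identifying the generating model)
Trained on noise-augmented data to improve robustness and generalization
Ranked 1st in both subtasks of the Defactify 4.0 shared task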
Abstract
This paper presents an effective approach to detecting AI-generated text, developed for the Defactify 4.0 shared task at the fourth workshop on multimodal fact checking and hate speech detection. The task consists of two subtasks: Task-A, classifying whether a text is AI-generated or human-written, and Task-B, identifying the specific large language model that generated the text. Our team (Sarang) achieved 1st place in both subtasks, with F1 scores of 1.0 and 0.9531, respectively. The methodology involves adding noise to the dataset to improve model robustness and generalization. We used an ensemble of DeBERTa models to effectively capture complex patterns in the text. The results indicate the effectiveness of our noise-driven and ensemble-based approach, setting a new standard in AI-generated text detection and providing guidance for future developments.
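The abstract names two ingredients, noise added to the data and an ensemble of classifiers, without giving details here. The following is a minimal sketch of what such a pipeline could look like, not the authors' implementation. The noise scheme (replacing a fraction of words with random letter strings), the soft-voting rule (averaging class probabilities), and the names `add_word_noise` and `soft_vote` are all assumptions made for illustration. In practice, the probability lists would come from fine-tuned DeBERTa classifiers.

```python
import random
import string
from typing import List, Sequence


def add_word_noise(text: str, noise_rate: float = 0.1, seed: int = 0) -> str:
    """Replace a fraction of words with random lowercase strings of equal length.

    Hypothetical noise scheme, assumed for illustration only.
    """
    rng = random.Random(seed)
    words = text.split()
    n_noisy = int(len(words) * noise_rate)
    # Pick distinct word positions to corrupt.
    for pos in rng.sample(range(len(words)), n_noisy):
        words[pos] = "".join(
            rng.choice(string.ascii_lowercase) for _ in words[pos]
        )
    return " ".join(words)


def soft_vote(prob_sets: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Average class probabilities across models and pick the top class.

    prob_sets[m][i][c] is model m's probability of class c for example i.
    Averaging is an assumed combination rule, not one confirmed by the source.
    """
    n_models = len(prob_sets)
    predictions = []
    for i in range(len(prob_sets[0])):
        n_classes = len(prob_sets[0][i])
        avg = [
            sum(model[i][c] for model in prob_sets) / n_models
            for c in range(n_classes)
        ]
        predictions.append(max(range(n_classes), key=avg.__getitem__))
    return predictions


if __name__ == "__main__":
    sample = "the quick brown fox jumps over the lazy dog today"
    print(add_word_noise(sample, noise_rate=0.2, seed=1))

    # Two hypothetical models scoring three texts (class 0 = human, 1 = AI).
    model_a = [[0.9, 0.1], [0.4, 0.6], [0.45, 0.55]]
    model_b = [[0.6, 0.4], [0.2, 0.8], [0.9, 0.1]]
    print(soft_vote([model_a, model_b]))
```

In this sketch the third text shows why averaging can matter: the two hypothetical models disagree, and the more confident one decides the ensemble label.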
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Computational and Text Analysis Methods
Methods: DeBERTa
