Stacking the Odds: Transformer-Based Ensemble for AI-Generated Text Detection

Duke Nguyen; Khaing Myat Noe Naing; Aditya Joshi

arXiv:2310.18906 · cs.CL · October 31, 2023 · 2 cites

TL;DR

This paper presents a stacking ensemble of accessible, lightweight Transformer models for AI-generated text detection. The ensemble reaches an accuracy of 0.9555 on the official ALTA 2023 Shared Task test data and outperforms its component models used individually.

Contribution

It introduces a novel ensemble approach using accessible models that improves detection accuracy over individual models.

Findings

01

Achieved 95.55% accuracy on the official test data of the ALTA 2023 Shared Task.

02

Ensembling lightweight models enhances detection performance.

03

The approach outperforms individual models in AI-generated text detection.

Abstract

This paper reports our submission under the team name 'SynthDetectives' to the ALTA 2023 Shared Task. We use a stacking ensemble of Transformers for the task of AI-generated text detection. Our approach is novel in its choice of models: we use accessible and lightweight models in the ensemble. We show that ensembling the models improves accuracy compared with using them individually. Our approach achieves an accuracy score of 0.9555 on the official test data provided by the shared task organisers.
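To make the stacking idea in the abstract concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: the abstract does not name the base models or the meta-learner, so this example assumes three hypothetical base detectors whose outputs are simulated as noisy probabilities, and it assumes a logistic-regression meta-classifier trained on those outputs. All function names (`make_toy_data`, `train_meta`, `predict`, `accuracy`) and all numeric settings are illustrative.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_toy_data(n, seed):
    """Simulate the outputs of three hypothetical base detectors.

    Each detector emits a noisy probability that a text is AI-generated
    (label 1) rather than human-written (label 0). In a real system these
    would come from fine-tuned Transformer classifiers.
    """
    rng = random.Random(seed)
    features, labels = [], []
    for _ in range(n):
        y = rng.randint(0, 1)
        # Assumed noise levels; each detector errs independently.
        row = [min(1.0, max(0.0, y + rng.gauss(0, s))) for s in (0.45, 0.5, 0.55)]
        features.append(row)
        labels.append(y)
    return features, labels


def train_meta(features, labels, epochs=300, lr=0.5):
    """Fit a logistic-regression meta-classifier by batch gradient descent."""
    k = len(features[0])
    n = len(features)
    w = [0.0] * k
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * k
        grad_b = 0.0
        for x, y in zip(features, labels):
            p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
            err = p - y
            for j in range(k):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict(w, b, x):
    # The stacked prediction: combine base-model outputs, then threshold.
    return 1 if sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) >= 0.5 else 0


def accuracy(preds, labels):
    return sum(p == y for p, y in zip(preds, labels)) / len(labels)


# Train the meta-classifier on one split and evaluate on a held-out split.
train_x, train_y = make_toy_data(400, seed=0)
test_x, test_y = make_toy_data(200, seed=1)
w, b = train_meta(train_x, train_y)

ensemble_acc = accuracy([predict(w, b, x) for x in test_x], test_y)
single_accs = [
    accuracy([1 if x[j] >= 0.5 else 0 for x in test_x], test_y)
    for j in range(3)
]
print("single-model accuracies:", single_accs)
print("stacked ensemble accuracy:", ensemble_acc)
```

The key design point is that the meta-classifier never sees the raw text; it only learns how much to trust each base detector's output. Swapping the simulated probabilities for real model predictions on a held-out split would keep the rest of the sketch unchanged.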

Code & Models

Repositories

dukeraphaelng/synth_detectives
PyTorch · Official

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques