LuxVeri at GenAI Detection Task 1: Inverse Perplexity Weighted Ensemble for Robust Detection of AI-Generated Text across English and Multilingual Contexts

Md Kamrujjaman Mobin; Md Saiful Islam

arXiv:2501.11914 · cs.CL · January 22, 2025



Open Access

TL;DR

This paper introduces an inverse perplexity weighted ensemble method for detecting AI-generated text. The ensemble reaches a Macro F1-score of 0.7458 on English detection and 0.7513 on multilingual detection, which the authors take as evidence that ensemble techniques are effective in this domain.

Contribution

The paper proposes a novel inverse perplexity weighted ensemble approach for robust AI-generated text detection across multiple languages, advancing beyond existing single-model methods.

Findings

1. Achieved a Macro F1-score of 0.7458 for English detection
2. Achieved a Macro F1-score of 0.7513 for multilingual detection
3. Demonstrated the effectiveness of inverse perplexity weighting in ensemble models

Abstract

This paper presents a system developed for Task 1 of the COLING 2025 Workshop on Detecting AI-Generated Content, focusing on the binary classification of machine-generated versus human-written text. Our approach utilizes an ensemble of models, with weights assigned according to each model's inverse perplexity, to enhance classification accuracy. For the English text detection task, we combined RoBERTa-base, RoBERTa-base with the OpenAI detector, and BERT-base-cased, achieving a Macro F1-score of 0.7458, which ranked us 12th out of 35 teams. For the multilingual text detection task, we ensembled RemBERT, XLM-RoBERTa-base, and BERT-base-multilingual-cased, employing the same inverse perplexity weighting technique. This resulted in a Macro F1-score of 0.7513, positioning us 4th out of 25 teams. Our results demonstrate the effectiveness of inverse perplexity weighting in improving the…
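The weighting idea in the abstract can be illustrated with a minimal sketch. It assumes each model outputs a probability that a text is machine-generated, and that each model has a single perplexity value, taken here as the exponential of a mean cross-entropy loss on held-out data. The abstract does not say how perplexity is measured or how the weights are normalized, so the function names, the normalization to a sum of 1, and all numbers below are hypothetical illustrations rather than the authors' implementation.

```python
import math


def perplexity_from_loss(mean_ce_loss):
    """Assumption: perplexity is exp(mean cross-entropy loss) on held-out data."""
    return math.exp(mean_ce_loss)


def inverse_perplexity_weights(perplexities):
    """Give each model a weight proportional to 1 / perplexity, normalized to sum to 1."""
    inverse = [1.0 / p for p in perplexities]
    total = sum(inverse)
    return [v / total for v in inverse]


def ensemble_probs(model_probs, weights):
    """Weighted average of per-model probabilities.

    model_probs[m][i] is model m's P(machine-generated) for example i.
    """
    n_examples = len(model_probs[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, model_probs))
        for i in range(n_examples)
    ]


# Hypothetical validation losses for three models (not values from the paper).
losses = [0.40, 0.55, 0.70]
weights = inverse_perplexity_weights([perplexity_from_loss(l) for l in losses])

# Hypothetical per-model probabilities for two texts.
model_probs = [
    [0.90, 0.20],
    [0.80, 0.40],
    [0.60, 0.55],
]
combined = ensemble_probs(model_probs, weights)
labels = [int(p >= 0.5) for p in combined]  # 1 = machine-generated, 0 = human-written
```

Under this scheme a model with lower perplexity contributes more to the final score, and the 0.5 decision threshold is likewise an assumption that could be tuned on validation data.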


Taxonomy

Topics: Natural Language Processing Techniques