Mast Kalandar at SemEval-2024 Task 8: On the Trail of Textual Origins: RoBERTa-BiLSTM Approach to Detect AI-Generated Text

Jainit Sushil Bafna; Hardik Mittal; Suyash Sethia; Manish Shrivastava; Radhika Mamidi

arXiv:2407.02978·cs.CL·July 4, 2024

Open Access

TL;DR

This paper presents a RoBERTa-BiLSTM classifier for detecting AI-generated text, motivated by concerns about misuse of such text, and compares it with baseline approaches on SemEval-2024 Task 8, where it achieved 80.83% accuracy.

Contribution

Introduces a novel RoBERTa-BiLSTM model for AI text detection and provides a comparative analysis with baseline methods in a challenging multilingual setting.

Findings

01

Achieved 80.83% accuracy on the SemEval-2024 leaderboard.

02

Model outperforms baseline approaches in detecting AI-generated text.

03

Contributes to automatic detection systems for preventing misuse of AI-generated content.

Abstract

Large Language Models (LLMs) have showcased impressive abilities in generating fluent responses to diverse user queries. However, concerns have surfaced regarding the potential misuse of such texts in journalistic, educational, and academic contexts. SemEval 2024 introduces the task of Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection, which aims to develop automated systems for identifying machine-generated text and detecting potential misuse. In this paper, we i) propose a RoBERTa-BiLSTM based classifier designed to classify text into two categories, AI-generated or human, and ii) conduct a comparative study of our model with baseline approaches to evaluate its effectiveness. This paper contributes to the advancement of automatic text detection systems in addressing the challenges posed by machine-generated text misuse. Our architecture ranked 46th on the…
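
The abstract names the architecture but this page gives no implementation details, so the following is only a minimal PyTorch sketch of what an encoder-plus-BiLSTM binary classifier can look like. Everything specific in it is an assumption made for illustration: the class names `ToyEncoder` and `EncoderBiLSTMClassifier` are hypothetical, the layer sizes are arbitrary, the masked mean pooling is one possible choice rather than the authors' confirmed method, and the tiny embedding-only encoder merely stands in for a pretrained RoBERTa model so that the example runs without downloading weights.

```python
import torch
import torch.nn as nn


class ToyEncoder(nn.Module):
    """Hypothetical stand-in for a pretrained RoBERTa encoder (embedding lookup only)."""

    def __init__(self, vocab_size: int = 100, hidden_size: int = 32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)

    def forward(self, input_ids: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
        # Returns per-token hidden states of shape (batch, seq_len, hidden_size).
        return self.embed(input_ids)


class EncoderBiLSTMClassifier(nn.Module):
    """Encoder -> bidirectional LSTM -> linear head over two labels (sketch)."""

    def __init__(self, encoder: nn.Module, hidden_size: int,
                 lstm_hidden: int = 16, num_labels: int = 2):
        super().__init__()
        self.encoder = encoder
        self.bilstm = nn.LSTM(hidden_size, lstm_hidden,
                              batch_first=True, bidirectional=True)
        # The BiLSTM concatenates forward and backward states: 2 * lstm_hidden.
        self.classifier = nn.Linear(2 * lstm_hidden, num_labels)

    def forward(self, input_ids: torch.Tensor, attention_mask: torch.Tensor) -> torch.Tensor:
        states = self.encoder(input_ids, attention_mask)
        # For simplicity the LSTM also runs over padded positions;
        # they are excluded only at the pooling step below.
        lstm_out, _ = self.bilstm(states)
        # Assumed pooling: mean over non-padded positions.
        mask = attention_mask.unsqueeze(-1).to(lstm_out.dtype)
        pooled = (lstm_out * mask).sum(dim=1) / mask.sum(dim=1).clamp(min=1.0)
        return self.classifier(pooled)  # logits, shape (batch, num_labels)


torch.manual_seed(0)
model = EncoderBiLSTMClassifier(ToyEncoder(), hidden_size=32)

# Toy batch: 4 sequences of 10 token ids; the first one is padded after 6 tokens.
input_ids = torch.randint(0, 100, (4, 10))
attention_mask = torch.ones(4, 10, dtype=torch.long)
attention_mask[0, 6:] = 0

logits = model(input_ids, attention_mask)
predictions = logits.argmax(dim=-1)  # label ids 0 or 1 (which is "human" is arbitrary here)
print(logits.shape)
```

To move from this toy setup toward the setting the abstract describes, the stand-in encoder would be replaced by a pretrained RoBERTa model (for example, one loaded with the Hugging Face `transformers` library and wrapped so that `forward` returns its `last_hidden_state`), with `hidden_size` set to match that model.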

Taxonomy

Topics: Authorship Attribution and Profiling · Computational and Text Analysis Methods · Language and cultural evolution