Sharif-MGTD at SemEval-2024 Task 8: A Transformer-Based Approach to Detect Machine Generated Text

Seyedeh Fatemeh Ebrahimi; Karim Akhavan Azari; Amirmasoud Iravani; Arian Qazvini; Pouya Sadeghi; Zeinab Sadat Taghavi; Hossein Sameti

arXiv:2407.11774·cs.CL·February 19, 2025

TL;DR

This paper presents a transformer-based approach that fine-tunes RoBERTa-base to detect machine-generated text, reaching 78.9% accuracy (57th place) on SemEval-2024 Task 8, Subtask A (Monolingual-English) under limited hardware resources.

Contribution

The study demonstrates the effectiveness of fine-tuning RoBERTa for MGT detection within resource constraints, focusing on monolingual English texts in a competitive setting.

Findings

01. Achieved 78.9% accuracy on test data

02. Ranked 57th among participants

03. Effective in identifying human-written text

Abstract

Detecting Machine-Generated Text (MGT) has emerged as a significant area of study within Natural Language Processing. While language models generate text, they often leave discernible traces, which can be scrutinized using either traditional feature-based methods or more advanced neural language models. In this research, we explore the effectiveness of fine-tuning a RoBERTa-base transformer, a powerful neural architecture, to address MGT detection as a binary classification task. Focusing specifically on Subtask A (Monolingual-English) within the SemEval-2024 competition framework, our proposed system achieves an accuracy of 78.9% on the test dataset, positioning us at 57th among participants. Our study addresses this challenge while considering the limited hardware resources, resulting in a system that excels at identifying human-written texts but encounters challenges in accurately…
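
The abstract reports a single accuracy figure alongside the observation that the system handles human-written texts better than machine-generated ones. A minimal sketch of how such a gap can be made visible is to compute per-class recall next to overall accuracy. The label encoding (`0` for human, `1` for machine), the function name `evaluate`, and the toy labels below are illustrative assumptions, not the authors' code or data.

```python
from collections import Counter

# Assumed label encoding for this sketch (not taken from the paper).
HUMAN, MACHINE = 0, 1


def evaluate(gold, pred):
    """Return overall accuracy and per-class recall for binary MGT labels."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")
    if not gold:
        raise ValueError("gold and pred must not be empty")

    # Count each (gold label, predicted label) pair once.
    pairs = Counter(zip(gold, pred))
    correct = pairs[(HUMAN, HUMAN)] + pairs[(MACHINE, MACHINE)]

    def recall(label):
        # Share of examples with this gold label that were predicted correctly.
        total = sum(1 for g in gold if g == label)
        return pairs[(label, label)] / total if total else float("nan")

    return {
        "accuracy": correct / len(gold),
        "human_recall": recall(HUMAN),
        "machine_recall": recall(MACHINE),
    }


if __name__ == "__main__":
    # Toy labels, made up purely for illustration.
    gold = [0, 0, 0, 0, 1, 1, 1, 1]
    pred = [0, 0, 0, 0, 1, 1, 0, 0]
    print(evaluate(gold, pred))
```

On a classifier that leans toward the human class, `human_recall` would be high while `machine_recall` lags, even when the overall accuracy looks reasonable.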

Taxonomy

Topics: Topic Modeling · Natural Language Processing Techniques