Sharif-MGTD at SemEval-2024 Task 8: A Transformer-Based Approach to Detect Machine Generated Text
Seyedeh Fatemeh Ebrahimi, Karim Akhavan Azari, Amirmasoud Iravani,, Arian Qazvini, Pouya Sadeghi, Zeinab Sadat Taghavi, Hossein Sameti

TL;DR
This paper presents a transformer-based approach using fine-tuned RoBERTa to detect machine-generated text, achieving competitive accuracy in a SemEval-2024 task with limited hardware resources.
Contribution
The study demonstrates the effectiveness of fine-tuning RoBERTa for MGT detection within resource constraints, focusing on monolingual English texts in a competitive setting.
Findings
Achieved 78.9% accuracy on test data
Ranked 57th among participants
Effective in identifying human-written text
Abstract
Detecting Machine-Generated Text (MGT) has emerged as a significant area of study within Natural Language Processing. While language models generate text, they often leave discernible traces, which can be scrutinized using either traditional feature-based methods or more advanced neural language models. In this research, we explore the effectiveness of fine-tuning a RoBERTa-base transformer, a powerful neural architecture, to address MGT detection as a binary classification task. Focusing specifically on Subtask A (Monolingual-English) within the SemEval-2024 competition framework, our proposed system achieves an accuracy of 78.9% on the test dataset, positioning us at 57th among participants. Our study addresses this challenge while considering the limited hardware resources, resulting in a system that excels at identifying human-written texts but encounters challenges in accurately…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsTopic Modeling · Natural Language Processing Techniques
