SemEval-2024 Task 8: Weighted Layer Averaging RoBERTa for Black-Box Machine-Generated Text Detection
Ayan Datta, Aryan Chandramania, Radhika Mamidi

TL;DR
This paper presents a method that uses weighted averages of RoBERTa layers to detect machine-generated text, submitted to SemEval-2024 Task 8 Subtasks A (monolingual) and B, in response to the growing challenge posed by large language models.
Contribution
The authors apply weighted layer averaging of RoBERTa to black-box machine-generated text detection in the multigenerator, multidomain setting of SemEval-2024 Task 8.
Findings
Effective in detecting machine-generated text across languages
Improves detection accuracy over baseline models
Applicable to various domains and LLM outputs
Abstract
This document contains the details of the authors' submission to the proceedings of SemEval 2024's Task 8: Multigenerator, Multidomain, and Multilingual Black-Box Machine-Generated Text Detection, Subtasks A (monolingual) and B. Detection of machine-generated text is becoming increasingly important with the advent of large language models (LLMs). In this paper, we lay out how using weighted averages of RoBERTa layers lets us capture information about text that is relevant to machine-generated text detection.
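The weighted layer averaging described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact weighting scheme, so this example assumes one learnable scalar per layer, normalized with a softmax, and uses random arrays in place of real RoBERTa hidden states. The function names and the shape (13 layers, hidden size 768, matching a base-sized model's embedding output plus 12 encoder layers) are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    shifted = x - np.max(x)
    exp = np.exp(shifted)
    return exp / exp.sum()


def weighted_layer_average(layer_states, layer_logits):
    """Mix per-layer hidden states into a single representation.

    layer_states: array of shape (num_layers, seq_len, hidden)
    layer_logits: array of shape (num_layers,), one scalar per layer
                  (assumed here to be learnable parameters)
    Returns an array of shape (seq_len, hidden).
    """
    weights = softmax(layer_logits)  # non-negative, sums to 1
    # Contract the layer axis: sum_l weights[l] * layer_states[l]
    return np.tensordot(weights, layer_states, axes=(0, 0))


# Stand-in for encoder outputs (assumption: 13 hidden states of size 768).
rng = np.random.default_rng(0)
num_layers, seq_len, hidden = 13, 8, 768
states = rng.normal(size=(num_layers, seq_len, hidden))

# Zero logits give uniform weights, i.e. a plain average of all layers.
logits = np.zeros(num_layers)
mixed = weighted_layer_average(states, logits)

# Mean-pool over tokens to get one vector that a classifier head could consume.
pooled = mixed.mean(axis=0)
```

In a trained model the per-layer scalars would be updated jointly with the classifier, so that layers carrying more signal for the detection task receive larger weights.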
Taxonomy
Topics: Natural Language Processing Techniques · Text and Document Classification Technologies · Topic Modeling
Methods: Attention Is All You Need · Softmax · WordPiece · Weight Decay · Linear Layer · Layer Normalization · Dense Connections · Attention Dropout · Residual Connection
