RKadiyala at SemEval-2024 Task 8: Black-Box Word-Level Text Boundary Detection in Partially Machine Generated Texts
Ram Mohan Rao Kadiyala

TL;DR
This paper introduces methods for word-level detection of machine-generated segments within texts, addressing a gap in existing models that only classify entire texts as human or machine authored.
Contribution
The paper presents novel approaches for identifying machine-generated text at the word level, improving detection accuracy across unseen domains and different generators.
Findings
Significant improvement in detection accuracy.
Effective on unseen domains and generators.
Suitable for identifying machine-generated segments in LLM outputs.
Abstract
With increasing usage of generative models for text generation and widespread use of machine generated texts in various domains, being able to distinguish between human written and machine generated texts is a significant challenge. While existing models and proprietary systems focus on identifying whether given text is entirely human written or entirely machine generated, only a few systems provide insights at sentence or paragraph level at likelihood of being machine generated at a non reliable accuracy level, working well only for a set of domains and generators. This paper introduces few reliable approaches for the novel task of identifying which part of a given text is machine generated at a word level while comparing results from different approaches and methods. We present a comparison with proprietary systems , performance of our model on unseen domains' and generators' texts.…
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsNatural Language Processing Techniques · Topic Modeling · Text Readability and Simplification
MethodsNetwork On Network · Sparse Evolutionary Training · Focus
