HC3 Plus: A Semantic-Invariant Human ChatGPT Comparison Corpus
Zhenpeng Su, Xing Wu, Wei Zhou, Guangyuan Ma, Songlin Hu

TL;DR
This paper introduces HC3 Plus, a dataset for detecting ChatGPT-generated text that adds semantic-invariant tasks (summarization, translation, and paraphrasing). It shows that detection is harder on these tasks and that instruction fine-tuning improves detection accuracy.
Contribution
The paper presents HC3 Plus, a new dataset covering diverse semantic-invariant tasks, and evaluates instruction fine-tuned models for detecting AI-generated content.
Findings
Detection is more challenging in semantic-invariant tasks.
Instruction fine-tuning improves detection performance.
HC3 Plus broadens evaluation beyond question answering to summarization, translation, and paraphrasing.
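As a rough illustration of the second finding, detection can be cast as an instruction-following task and scored per task. The sketch below is not the authors' code: the instruction wording, the `Example` fields, and the helper names `to_instruction_record` and `accuracy_by_task` are all hypothetical, chosen only to show the general shape of such a setup.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical instruction text; the wording used in the paper is not given here.
INSTRUCTION = (
    "Decide whether the following text was written by a human or "
    "generated by ChatGPT. Answer with 'human' or 'model'."
)


@dataclass
class Example:
    task: str   # assumed task tag, e.g. "summarization" or "translation"
    text: str   # the passage to classify
    label: str  # "human" or "model"


def to_instruction_record(ex: Example) -> dict:
    """Turn one labeled example into a prompt/target pair for fine-tuning."""
    prompt = f"{INSTRUCTION}\n\nText: {ex.text}\nAnswer:"
    return {"prompt": prompt, "target": ex.label}


def accuracy_by_task(examples: list, predictions: list) -> dict:
    """Compute detection accuracy separately for each task."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for ex, pred in zip(examples, predictions):
        total[ex.task] += 1
        if pred == ex.label:
            correct[ex.task] += 1
    return {task: correct[task] / total[task] for task in total}
```

Reporting accuracy per task rather than as a single pooled number is what would make a gap between question answering and semantic-invariant tasks visible.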
Abstract
ChatGPT has garnered significant interest due to its impressive performance; however, there is growing concern about its potential risks, particularly in the detection of AI-generated content (AIGC), which is often challenging for untrained individuals to identify. Current datasets used for detecting ChatGPT-generated text primarily focus on question-answering tasks, often overlooking tasks with semantic-invariant properties, such as summarization, translation, and paraphrasing. In this paper, we demonstrate that detecting model-generated text in semantic-invariant tasks is more challenging. To address this gap, we introduce a more extensive and comprehensive dataset that incorporates a wider range of tasks than previous work, including those with semantic-invariant properties. In addition, instruction fine-tuning has demonstrated superior performance across various tasks. In this…
Taxonomy
Topics: Topic Modeling · Natural Language Processing Techniques · Artificial Intelligence in Healthcare and Education
