How Close is ChatGPT to Human Experts? Comparison Corpus, Evaluation, and Detection

Biyang Guo; Xin Zhang; Ziyuan Wang; Minqi Jiang; Jinran Nie; Yuxuan Ding; Jianwei Yue; Yupeng Wu

arXiv:2301.07597 · cs.CL · January 19, 2023 · 292 cites

Open Access · 3 Repos · 10 Models · 5 Datasets

TL;DR

This paper introduces HC3, a large dataset comparing ChatGPT and human responses across various domains, analyzes their differences, and develops detection methods to distinguish AI-generated from human text.

Contribution

It provides the HC3 dataset for evaluating ChatGPT versus human responses and proposes effective detection systems for AI-generated text.

Findings

01 ChatGPT responses differ significantly from those of human experts in style and content.

02 Detection systems identify ChatGPT-generated text with high accuracy.

03 Linguistic and content analyses reveal key gaps between ChatGPT and human responses.

Abstract

The introduction of ChatGPT has garnered widespread attention in both academic and industrial communities. ChatGPT is able to respond effectively to a wide range of human questions, providing fluent and comprehensive answers that significantly surpass previous public chatbots in terms of security and usefulness. On one hand, people are curious about how ChatGPT is able to achieve such strength and how far it is from human experts. On the other hand, people are starting to worry about the potential negative impacts that large language models (LLMs) like ChatGPT could have on society, such as fake news, plagiarism, and social security issues. In this work, we collected tens of thousands of comparison responses from both human experts and ChatGPT, with questions ranging from open-domain, financial, medical, legal, and psychological areas. We call the collected dataset the Human ChatGPT…

Taxonomy

Topics: Artificial Intelligence in Healthcare and Education · Topic Modeling · Text Readability and Simplification