TL;DR
This paper introduces ArGPT, a new dataset of arguments generated by ChatGPT, along with methodology and benchmarks for identifying and analyzing good, bad, and ugly arguments to combat misinformation.
Contribution
It presents a novel dataset, ArGPT, and a methodology for generating and evaluating diverse arguments from ChatGPT for argumentation tasks.
Findings
ArGPT data relates well to human argumentation
Baselines are established for several argumentation-related tasks
The dataset is useful for training and testing systems that detect fake or poor arguments
Abstract
The recent success of Large Language Models (LLMs) has sparked concerns about their potential to spread misinformation. As a result, there is a pressing need for tools to identify "fake arguments" generated by such models. To create these tools, examples of texts generated by LLMs are needed. This paper introduces a methodology to obtain good, bad and ugly arguments from argumentative essays produced by ChatGPT, OpenAI's LLM. We then describe a novel dataset containing a set of diverse arguments, ArGPT. We assess the effectiveness of our dataset and establish baselines for several argumentation-related tasks. Finally, we show that the artificially generated data relates well to human argumentation and thus is useful as a tool to train and test systems for the defined tasks.
Taxonomy
Methods: Sparse Evolutionary Training
