Can LLM-Generated Misinformation Be Detected?

Canyu Chen; Kai Shu

arXiv:2309.13788·cs.CL·April 25, 2024·36 cites

TL;DR

This paper investigates the difficulty of detecting misinformation generated by Large Language Models, revealing that such misinformation can be more deceptive and harder to identify than human-written content, raising concerns for online safety.

Contribution

It provides a taxonomy of LLM-generated misinformation, categorizes generation methods, and empirically demonstrates detection challenges compared to human misinformation.

Findings

01

LLM-generated misinformation is harder for humans to detect.

02

Detection difficulty for LLM misinformation exceeds that of human-written content.

03

Implications for developing more effective misinformation countermeasures.

Abstract

The advent of Large Language Models (LLMs) has made a transformative impact. However, the potential that LLMs such as ChatGPT can be exploited to generate misinformation has posed a serious concern to online safety and public trust. A fundamental research question is: will LLM-generated misinformation cause more harm than human-written misinformation? We propose to tackle this question from the perspective of detection difficulty. We first build a taxonomy of LLM-generated misinformation. Then we categorize and validate the potential real-world methods for generating misinformation with LLMs. Then, through extensive empirical investigation, we discover that LLM-generated misinformation can be harder to detect for humans and detectors compared to human-written misinformation with the same semantics, which suggests it can have more deceptive styles and potentially cause more harm. We also…
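The comparison described in the abstract can be pictured as measuring how often a detector (or a human annotator) correctly flags misinformation, computed separately for human-written and LLM-generated items. The following is a minimal sketch under stated assumptions, not the authors' code: the `Judgment` record, the `success_rate` helper, and the toy judgments are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Judgment:
    """One detection outcome for an item that is known to be misinformation."""
    source: str    # "human" or "llm": who wrote the item (hypothetical labels)
    flagged: bool  # True if the detector labeled the item as misinformation


def success_rate(judgments, source):
    """Fraction of misinformation items from `source` that were correctly flagged."""
    items = [j for j in judgments if j.source == source]
    if not items:
        raise ValueError(f"no judgments for source {source!r}")
    return sum(j.flagged for j in items) / len(items)


# Toy data, invented for illustration; these are not results from the paper.
judgments = [
    Judgment("human", True), Judgment("human", True),
    Judgment("human", True), Judgment("human", False),
    Judgment("llm", True), Judgment("llm", False),
    Judgment("llm", False), Judgment("llm", True),
]

human_rate = success_rate(judgments, "human")
llm_rate = success_rate(judgments, "llm")

# A positive gap means the LLM-generated items were harder to detect.
gap = human_rate - llm_rate
print(f"human: {human_rate:.2f}, llm: {llm_rate:.2f}, gap: {gap:.2f}")
```

For the comparison to be meaningful, the two groups should carry the same semantics, as the abstract specifies, so that any gap reflects style rather than content.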

Peer Reviews

Decision·ICLR 2024 poster

Reviewer 01 · Rating 3 (reject, not good enough) · Confidence 3

Strengths

The paper does a good job at describing the problem statement and their contributions. It's a good survey on the related techniques within this space.
- The misinformation taxonomy and the generation strategies of hallucination, Arbitrary Misinformation and Controllable Misinformation generation are interesting to note
- Utilizing CoT and non CoT prompting to study LLM based misinformation detection is interesting

Overall the paper is a comprehensive study on LLM generated misinformation and re

Weaknesses

The paper lacks a review or comparison with pre-LLM era misinformation or fake news detection strategies. There are techniques within fact-finding and source-attribution space which can be leveraged to detect misinformation and those haven't been discussed. The paper often uses Appendix sections to support the claims made which makes it less readable and less self-contained. The paper establishes what 'detectors' are, rather late. Overall the paper is a comprehensive study on LLM generated mi

Reviewer 02 · Rating 3 (reject, not good enough) · Confidence 5

Strengths

The paper is a well-written and informative contribution to the field of misinformation research. It provides important insights into the potential for LLMs to be used to generate deceptive and harmful misinformation.
- It is one of the first papers to systematically investigate the detectability of LLM-generated misinformation.
- It creates a taxonomy and identifies three different types of LLM-generated misinformation: Hallucinated News Generation, Totally Arbitrary Generation, and Partially A

Weaknesses

Concern 1: The study uses a relatively small number of evaluators and evaluates only a limited number of LLM-generated news items. This means that the findings of the study may not be generalizable to all LLM-generated news items.
Concern 2: The study does not evaluate the effectiveness of different detection methods for LLM-generated misinformation. This means that it is not clear how well existing detection methods would perform at detecting the LLM-generated news items used in the study. Jiameng

Reviewer 03 · Rating 8 (accept, good paper) · Confidence 5

Strengths

1. Significance of the research question: AI-generated misinformation is a very critical problem for the development of LLMs. The development of RLHF-based LLMs lets misinformation creators easily generate misinformation without any prior knowledge of deep learning. We urgently need exploration of this topic.
2. Contribution to the community: This paper discusses the problem in great detail and can provide us with good resources (dataset and prompts) to study this problem.
3.

Weaknesses

1. The dataset does not seem very large. I understand that for evaluating human detection difficulty, we cannot use too large a dataset. But the authors can enlarge the dataset for the evaluation of machine learning models.
2. For detector difficulty, the authors only discussed the zero-shot detection of generative LLMs. The results on other kinds of models (i.e., in-context-learning boosted LLMs, soft-prompt based LLMs, and encoder-based large models like BERT and its variants) are not discussed.

Code & Models

Repositories

llm-misinformation/llm-misinformation
pytorch · Official

Taxonomy

Topics: Topic Modeling · Misinformation and Its Impacts · Artificial Intelligence in Healthcare and Education