A Survey on Proactive Defense Strategies Against Misinformation in Large Language Models
Shuliang Liu, Hongyi Liu, Aiwei Liu, Bingchen Duan, Qi Zheng, Yibo Yan, He Geng, Peijie Jiang, Jia Liu, and Xuming Hu

TL;DR
This survey reviews proactive defense strategies against misinformation in large language models, emphasizing a shift from post hoc detection to anticipatory mitigation, and highlights both their effectiveness and their open challenges.
Contribution
It introduces a Three Pillars framework for proactive defense and provides a comprehensive survey and meta-analysis of existing techniques in this area.
Findings
Proactive strategies improve misinformation prevention by up to 63%.
They face challenges such as computational overhead and limited generalization.
The paper outlines future directions for robust LLM defenses.
Abstract
The widespread deployment of large language models (LLMs) across critical domains has amplified the societal risks posed by algorithmically generated misinformation. Unlike traditional false content, LLM-generated misinformation can be self-reinforcing, highly plausible, and capable of rapid propagation across multiple languages, which traditional detection methods fail to mitigate effectively. This paper introduces a proactive defense paradigm, shifting from passive post hoc detection to anticipatory mitigation strategies. We propose a Three Pillars framework: (1) Knowledge Credibility, fortifying the integrity of training and deployed data; (2) Inference Reliability, embedding self-corrective mechanisms during reasoning; and (3) Input Robustness, enhancing the resilience of model interfaces against adversarial attacks. Through a comprehensive survey of existing techniques and a…
Taxonomy
Topics: Misinformation and Its Impacts · Adversarial Robustness in Machine Learning · Explainable Artificial Intelligence (XAI)
Methods: High-Order Consensuses · Focus
