TL;DR
This paper introduces SALF, a symbolic adversarial learning framework in which a fake news generator and a detector improve through iterative, language-based adversarial interactions. The generated news significantly challenges current detection methods, and the adversarial process improves detector robustness.
Contribution
The work presents a novel symbolic adversarial training paradigm for fake news that replaces neural weight updates with language-based agent interactions. This lets the generator produce more sophisticated fake news and lets the detector refine itself against it.
Findings
SALF degrades state-of-the-art detection performance by up to 53.4% in Chinese and 34.2% in English.
SALF improves detection of refined fake news by up to 7.7%.
Experiments on multilingual datasets validate SALF's effectiveness.
Abstract
Rapid LLM advancements heighten fake news risks by enabling the automatic generation of increasingly sophisticated misinformation. Previous detection methods, including fine-tuned small models and LLM-based detectors, often struggle with the dynamically evolving nature of such misinformation. In this work, we propose a novel framework called the Symbolic Adversarial Learning Framework (SALF), which implements an adversarial training paradigm through an agent symbolic learning optimization process rather than relying on numerical updates. In SALF, the generation agent crafts deceptive narratives, the detection agent uses structured debates to identify logical and factual flaws, and both agents iteratively refine themselves through these adversarial interactions. Unlike traditional neural updates, we represent agents using agent symbolic learning, where learnable weights are defined…
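The adversarial loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the abstract is truncated before it defines the learnable weights, so the sketch assumes, purely for illustration, that each agent's learnable state is a list of natural-language instructions that is updated with text feedback instead of numerical gradients. All names (`SymbolicAgent`, `generate`, `detect`, `run_adversarial_rounds`) are hypothetical, and hard-coded string rules stand in for the LLM calls and the structured debate.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical mapping from a detector critique to generator feedback.
# In a real system an LLM would write this feedback; here it is hard-coded.
FEEDBACK_FOR_GENERATOR = {
    "sensational wording": "avoid sensational words",
    "no source given": "attribute the claim to a source",
}
# Hypothetical lesson the detector adopts whenever a story evades it.
DETECTOR_LESSON = "flag claims without a source"


@dataclass
class SymbolicAgent:
    """Agent whose learnable state is text (an assumption of this sketch)."""
    name: str
    instructions: List[str] = field(default_factory=list)

    def update(self, feedback: str) -> None:
        # Symbolic update: store a textual instruction, no numeric gradient step.
        if feedback and feedback not in self.instructions:
            self.instructions.append(feedback)


def generate(gen: SymbolicAgent, seed: str) -> str:
    """Toy generator: rewrites the seed according to its current instructions."""
    text = seed
    if "avoid sensational words" in gen.instructions:
        text = text.replace("SHOCKING: ", "")
    if "attribute the claim to a source" in gen.instructions:
        text += " according to officials"
    return text


def detect(det: SymbolicAgent, text: str) -> Tuple[bool, str]:
    """Toy detector: returns (flagged, critique). Stands in for the debate."""
    if "SHOCKING" in text:
        return True, "sensational wording"
    if DETECTOR_LESSON in det.instructions and "according to" not in text:
        return True, "no source given"
    return False, ""


def run_adversarial_rounds(seed: str, rounds: int):
    gen, det = SymbolicAgent("generator"), SymbolicAgent("detector")
    flags = []
    for _ in range(rounds):
        flagged, critique = detect(det, generate(gen, seed))
        flags.append(flagged)
        if flagged:
            # Generator was caught: it learns from the detector's critique.
            gen.update(FEEDBACK_FOR_GENERATOR[critique])
        else:
            # Detector was fooled: it adds a new check.
            det.update(DETECTOR_LESSON)
    return gen, det, flags


gen, det, flags = run_adversarial_rounds("SHOCKING: city bans all bicycles", 4)
print(flags)
print(gen.instructions, det.instructions)
```

In this toy run the two agents alternate between being caught and evading, and each loss produces a textual update to the losing side. The point of the sketch is only the control flow: text feedback plays the role that a gradient step would play in conventional adversarial training.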