TAGFN: A Text-Attributed Graph Dataset for Fake News Detection in the Age of LLMs
Kay Liu, Yuwei Han, Haoyan Xu, Henry Peng Zou, Yue Zhao, Philip S. Yu

TL;DR
The paper introduces TAGFN, a large-scale, real-world text-attributed graph dataset that treats fake news detection as a graph outlier detection problem. It supports evaluation of both traditional and LLM-based outlier detection methods, as well as fine-tuning LLMs for misinformation detection.
Contribution
It provides a new, well-annotated dataset for fake news detection, facilitating research on graph outlier detection and LLM fine-tuning for misinformation detection.
Findings
Enables rigorous evaluation of traditional and LLM-based outlier detection methods.
Supports development of misinformation detection capabilities in LLMs.
Fosters progress in trustworthy AI and robust graph-based detection.
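To make the evaluation setting concrete, the following is a minimal sketch of scoring nodes in a text-attributed graph and measuring detection quality with AUROC. Everything in it is an illustrative assumption rather than part of TAGFN: the `TextGraph` container, the toy articles and labels, the Jaccard-based neighbor-dissimilarity score, and the choice of AUROC as the metric. The paper's actual data schema, baselines, and metrics are not described in this summary.

```python
from dataclasses import dataclass


@dataclass
class TextGraph:
    """Hypothetical container: one text and one label per node, undirected edges."""
    texts: list[str]
    labels: list[int]              # 1 = fake (outlier), 0 = real
    edges: list[tuple[int, int]]

    def neighbors(self, node: int) -> list[int]:
        out = []
        for u, v in self.edges:
            if u == node:
                out.append(v)
            elif v == node:
                out.append(u)
        return out


def jaccard_distance(a: str, b: str) -> float:
    """1 - |A ∩ B| / |A ∪ B| over lowercase whitespace tokens."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    if not union:
        return 0.0
    return 1.0 - len(sa & sb) / len(union)


def outlier_scores(graph: TextGraph) -> list[float]:
    """Toy score: mean text dissimilarity between a node and its neighbors."""
    scores = []
    for i, text in enumerate(graph.texts):
        nbrs = graph.neighbors(i)
        if not nbrs:
            scores.append(0.0)  # isolated node: no neighborhood evidence
            continue
        total = sum(jaccard_distance(text, graph.texts[j]) for j in nbrs)
        scores.append(total / len(nbrs))
    return scores


def auroc(labels: list[int], scores: list[float]) -> float:
    """Probability that a random outlier outscores a random inlier (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one outlier and one inlier")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up toy data, for illustration only.
toy = TextGraph(
    texts=[
        "city council approves new budget",
        "council approves budget for new city park",
        "miracle pill cures all disease overnight",
        "city budget vote passes council",
    ],
    labels=[0, 0, 1, 0],
    edges=[(0, 1), (1, 2), (1, 3), (0, 3)],
)
print(auroc(toy.labels, outlier_scores(toy)))
```

Any detector that assigns each node a score, whether a traditional graph method or an LLM-based one, can be plugged in place of `outlier_scores` and compared under the same metric.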
Abstract
Large Language Models (LLMs) have recently revolutionized machine learning on text-attributed graphs, but the application of LLMs to graph outlier detection, particularly in the context of fake news detection, remains significantly underexplored. One of the key challenges is the scarcity of large-scale, realistic, and well-annotated datasets that can serve as reliable benchmarks for outlier detection. To bridge this gap, we introduce TAGFN, a large-scale, real-world text-attributed graph dataset for outlier detection, specifically fake news detection. TAGFN enables rigorous evaluation of both traditional and LLM-based graph outlier detection methods. Furthermore, it facilitates the development of misinformation detection capabilities in LLMs through fine-tuning. We anticipate that TAGFN will be a valuable resource for the community, fostering progress in robust graph-based outlier…
Taxonomy
Topics: Advanced Graph Neural Networks · Misinformation and Its Impacts · Multimodal Machine Learning Applications
