GuARD: Effective Anomaly Detection through a Text-Rich and Graph-Informed Language Model

Yunhe Pang; Bo Chen; Fanjin Zhang; Yanghui Rao; Evgeny Kharlamov; Jie Tang

arXiv:2412.03930 · cs.CL · August 8, 2025

TL;DR

GuARD is a novel model that combines graph structure and rich textual information using multi-modal instruction tuning, significantly improving anomaly detection accuracy and efficiency on text-rich graphs.

Contribution

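This paper introduces GuARD, an approach that integrates graph structural features with language models for anomaly detection. It targets two limitations of LLM-only methods: feeding in rich text directly can obscure detection cues and raise fine-tuning costs, and LLMs tend to overlook the structural information in graphs.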

Findings

01. GuARD outperforms existing graph-based and LLM-based anomaly detection methods.
02. GuARD achieves up to 5× speedup in training and inference.
03. GuARD demonstrates superior detection accuracy on four datasets.

Abstract

Anomaly detection on text-rich graphs is common in practice, for example detecting academic papers incorrectly assigned to authors and detecting bots in social networks. The remarkable capabilities of large language models (LLMs) open a new avenue for effective anomaly detection by exploiting rich-text information. However, simply feeding rich texts into LLMs can obscure essential detection cues and incur high fine-tuning costs. Moreover, LLMs often overlook the intrinsic structural bias of graphs, which is vital for distinguishing normal from abnormal node patterns. To this end, this paper introduces GuARD, a text-rich and graph-informed language model that combines key structural features from graph-based methods with fine-grained semantic attributes extracted via small language models for effective anomaly detection on text-rich graphs. GuARD is optimized with the…

Code & Models

Repositories

thudm/whoiswho · PyTorch · Official

Taxonomy

Topics: Natural Language Processing Techniques · Topic Modeling