OWLEYE: Zero-Shot Learner for Cross-Domain Graph Data Anomaly Detection
Lecheng Zheng, Dongqi Fu, Zihao Li, Jingrui He

TL;DR
OWLEYE is a zero-shot graph anomaly detection framework that learns transferable normal behavior patterns across domains, enabling effective anomaly detection without retraining or labeled data, and demonstrating superior generalization on real datasets.
Contribution
It introduces novel cross-domain feature alignment, multi-pattern dictionary learning, and attention-based reconstruction modules for zero-shot anomaly detection in graphs.
Findings
Outperforms state-of-the-art baselines on real-world datasets.
Demonstrates strong generalization across multiple domains.
Enables label-efficient anomaly detection without retraining.
Abstract
Graph data is well suited to representing complex relationships such as transactions between accounts, communications between devices, and dependencies among machines or processes. Correspondingly, graph anomaly detection (GAD) plays a critical role in identifying anomalies across various domains, including finance, cybersecurity, and manufacturing. Facing large-volume, multi-domain graph data, nascent efforts attempt to develop foundational generalist models capable of detecting anomalies in unseen graphs without retraining. To the best of our knowledge, the differing feature semantics and dimensions of cross-domain graph data heavily hinder the development of graph foundation models, leaving further in-depth continual learning and inference capabilities a largely open problem. Hence, we propose OWLEYE, a novel zero-shot GAD framework that learns transferable patterns of normal…
Peer Reviews
Decision·ICLR 2026 Poster
1. Zero-shot cross-domain graph anomaly detection is a significant and practical problem. The paper decomposes this challenge into three stages (feature alignment, pattern learning, and anomaly detection) and addresses each in turn, which yields a technically sound overall solution.
2. The paper clearly identifies the shortcomings of existing general GAD models in feature alignment. It proposes a well-motivated, novel, and effective solution.
3. Extensive experiments across multiple datasets demons
1. The overall innovation of the framework is incremental. Its technical pipeline, from feature alignment and multi-hop residual aggregation to a multi-pattern dictionary similar to “context learning”, largely follows the established paradigm of generalist GAD models.
2. Section 2.2 thoroughly argues for using “only structural similarity” (Equation 10) to address “camouflaged” anomalies. However, the final reconstruction formula (12) employs an undefined attribute-based similarity `sim(G, Dict_H)`
- The paper addresses a relevant and recent problem
- The idea of using dictionary learning for in-context GAD is interesting
- The proposed solution outperforms the baselines in most settings
- The paper writing could be greatly improved: Several parts of the paper concern me. Figure 1 is hard to read because of the small font and markers, and the overwhelming amount of information overall. The feature alignment method proposed in Section 2.1 is not well justified (why is it better than other alignment approaches?). The loss from Equation 13 is also not well motivated (is it novel? Why is each element in the loss needed?). In general, Section 2 should include citations for all ideas
1. This paper introduces a dynamic dictionary to store attribute-level and structure-level normal patterns, supporting incremental knowledge updates.
2. Avoiding noise pollution from spurious normal node sampling is a crucial issue. Truncated attention mechanisms, as a soft filtering method, are more robust than random sampling or hard thresholding.
3. Visualized feature analysis demonstrates a deep understanding of the data.
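To make the idea of attention-based reconstruction against a dictionary of normal patterns concrete, here is a minimal sketch. It is not the paper's formulation: the exact truncation rule is not given in this summary, so the sketch assumes top-k truncation followed by a softmax over the retained entries. The function name `truncated_attention`, the parameter `k`, and the random data are all hypothetical.

```python
import numpy as np

def truncated_attention(query, keys, values, k):
    """Attend over dictionary entries, keeping only the k highest-scoring ones."""
    # Scaled dot-product scores between the query node and each dictionary key.
    scores = keys @ query / np.sqrt(query.shape[0])
    keep = np.argsort(scores)[-k:]            # indices of the k largest scores
    masked = np.full_like(scores, -np.inf)    # truncated entries get zero weight
    masked[keep] = scores[keep]
    # Numerically stable softmax over the retained entries only.
    weights = np.exp(masked - scores[keep].max())
    weights /= weights.sum()
    return weights @ values, weights

rng = np.random.default_rng(0)
dictionary = rng.normal(size=(10, 4))   # hypothetical: 10 stored normal patterns
node = rng.normal(size=4)               # hypothetical node representation
recon, w = truncated_attention(node, dictionary, dictionary, k=3)
anomaly_score = np.linalg.norm(node - recon)  # reconstruction error as the score
```

Under this assumption, entries outside the top k contribute nothing to the reconstruction, while the retained entries are still weighted softly, so a node that is poorly explained by its best-matching normal patterns receives a larger reconstruction error.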
1. This paper claims that the truncated attention mechanism "filters out potential abnormal nodes," but it doesn't provide theoretical analysis or comparisons with other attention mechanisms.
2. Dictionary learning mechanisms implicitly assume "pattern transferability," and I have concerns about the general applicability of this assumption.
3. PCA is an unsupervised linear dimensionality reduction method. It can only guarantee dimensionality uniformity, but it cannot guarantee deep semantic ali
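The reviewer's point about PCA can be illustrated with a small sketch. It shows plain PCA only, not the paper's alignment module, and the graph sizes and feature dimensions are made-up values: two feature matrices with different widths are each projected onto their own top principal components, which yields a common dimension but no shared meaning for the resulting axes.

```python
import numpy as np

def pca_project(X, d):
    """Project rows of X onto its top-d principal components (plain PCA via SVD)."""
    Xc = X - X.mean(axis=0, keepdims=True)
    # Rows of Vt are principal directions, ordered by decreasing singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:d].T

rng = np.random.default_rng(0)
X_a = rng.normal(size=(50, 12))   # hypothetical graph A: 50 nodes, 12 features
X_b = rng.normal(size=(80, 30))   # hypothetical graph B: 80 nodes, 30 features

Z_a = pca_project(X_a, 8)
Z_b = pca_project(X_b, 8)
print(Z_a.shape, Z_b.shape)       # both now have 8 columns
```

Each graph's components are fitted independently, so column j of `Z_a` and column j of `Z_b` are directions of high variance in unrelated feature spaces. Equal shapes are all that this step guarantees.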
Taxonomy
Topics: Advanced Graph Neural Networks · Graph Theory and Algorithms · Anomaly Detection Techniques and Applications
