UniOD: A Universal Model for Outlier Detection across Diverse Domains
Dazhi Fu, Jicong Fan

TL;DR
UniOD introduces a universal outlier detection framework that leverages labeled data and graph neural networks to effectively identify outliers across diverse datasets without additional tuning.
Contribution
The paper presents UniOD, a novel universal outlier detection model that generalizes across datasets with different features and domains using a unified training approach.
Findings
Outperforms 17 baselines on 30 benchmark datasets.
No need for dataset-specific hyperparameter tuning.
Theoretically guarantees effectiveness of outlier detection.
Abstract
Outlier detection (OD), distinguishing inliers and outliers in completely unlabeled datasets, plays a vital role in science and engineering. Although there have been many insightful OD methods, most of them require troublesome hyperparameter tuning (a challenge in unsupervised learning) and costly model training for every task or dataset. In this work, we propose UniOD, a universal OD framework that leverages labeled datasets to train a single model capable of detecting outliers of datasets with different feature dimensions and heterogeneous feature spaces from diverse domains. Specifically, UniOD extracts uniform and comparable features across different datasets by constructing and factorizing multi-scale point-wise similarity matrices. It then employs graph neural networks to capture comprehensive within-dataset and between-dataset information simultaneously, and formulates outlier…
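The feature-extraction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel, the median-distance normalization, the bandwidth values, the rank, and the function name `multiscale_svd_features` are all assumptions chosen for illustration. The only parts taken from the source are the general idea of building point-wise similarity matrices at several scales and factorizing them (the reviews mention SVD) so that datasets with different numbers of columns end up with features of the same width.

```python
import numpy as np

def multiscale_svd_features(X, bandwidths=(0.5, 1.0, 2.0), rank=4):
    """Map an (n, d) dataset to an (n, len(bandwidths) * rank) feature matrix.

    Hypothetical helper: kernel, bandwidths, and rank are illustrative
    assumptions, not values taken from the paper.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]

    # Pairwise squared Euclidean distances, an n x n matrix.
    sq_norms = (X ** 2).sum(axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    np.maximum(sq_dists, 0.0, out=sq_dists)  # clip tiny negative round-off

    # Normalize by the median off-diagonal distance so the same bandwidths
    # are meaningful for datasets with different units and dimensions.
    median = np.median(sq_dists[np.triu_indices(n, k=1)])
    if median <= 0:
        median = 1.0

    blocks = []
    for s in bandwidths:
        # One similarity matrix per scale (Gaussian kernel, an assumption).
        K = np.exp(-sq_dists / (2.0 * s ** 2 * median))
        # Factorize and keep the leading `rank` components as node features.
        U, S, _ = np.linalg.svd(K)
        blocks.append(U[:, :rank] * np.sqrt(S[:rank]))
    return np.hstack(blocks)

rng = np.random.default_rng(0)
feats_a = multiscale_svd_features(rng.normal(size=(50, 3)))   # 3 raw columns
feats_b = multiscale_svd_features(rng.normal(size=(40, 10)))  # 10 raw columns
print(feats_a.shape, feats_b.shape)
```

In this sketch both datasets come out with 12 feature columns even though they start with 3 and 10, so a single downstream model could consume either one. Giving the columns the same width does not by itself make them comparable across datasets.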
Peer Reviews
Decision: ICLR 2026 Poster
• A Novel Paradigm for OD: The primary strength is the innovative concept of a universal framework that eliminates the need for per-dataset retraining. This directly addresses a major bottleneck in the practical application of outlier detection.
• Elegant Unification of Heterogeneous Data: The use of multi-scale similarity matrices combined with SVD is a powerful and clever technique for creating a unified feature space from datasets with diverse dimensionalities and semantics.
• Strong Empirica

• Heavy Reliance on Historical Data Composition: The model's success is fundamentally tied to the quality, scale, and diversity of the historical datasets. The paper lacks an investigation into the sensitivity of the model to the composition of this training pool.
• Potential Scalability Bottlenecks: The methodology relies on constructing an n×n similarity matrix, which has quadratic complexity (O(n²)) with respect to the number of samples. This could be computationally prohibitive for very la
1. Plug-and-play: the paper proposes the UniOD framework, which pretrains on various datasets and conducts inference in a zero-shot manner, saving deployment cost for OD tasks.
2. Unified dataset representation: the paper uses multi-scale similarity and SVD to produce unified node features, strengthening the generalizability of the model.
3. The authors provide both comprehensive theoretical justification and extensive empirical analysis of the method. UniOD is theoretically well grounded.

1. Dependence on historical datasets: UniOD requires labeled historical datasets, which can be unavailable in real-world applications. Limited historical datasets may impair the performance of UniOD on new datasets, especially when they cover only a few domains.
2. Generality concern: the effect of dataset variability is not fully discussed in the paper, as this will potentially influence the model's generality if the model hasn't encountered datasets from similar distrib
This paper presents several noteworthy strengths:
1. **Paradigm-Shifting Contribution**
   - First explicit proposal and systematic formalization of "universal outlier detection"
   - Core innovation: single model generalizing across diverse domains without retraining or hyperparameter tuning
   - Represents fundamental reformulation of conventional outlier detection paradigm
2. **Technical Innovation**
   - Elegant graph reformulation process transforms heterogeneous tabular data
   - Creates

While the proposed UniOD framework demonstrates compelling performance, several limitations warrant discussion for future improvement:
1. **Scalability Challenges in Preprocessing** The O(n²) computational and memory requirements for similarity matrix construction present practical constraints, as evidenced by the needed subsampling for datasets beyond 6,000 samples. Future work could explore approximate nearest neighbor techniques or sparse graph construction to enhance applicability to larg
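To make the reviewers' scalability point concrete, here is a minimal sketch of the sparse-graph direction they suggest. It is not part of UniOD: the helper names `dense_similarity_bytes` and `knn_neighbors`, the float64 storage assumption, and the chunk size are illustrative choices. The search is still exact and still does O(n²) arithmetic, but it processes rows in chunks and keeps only k neighbors per point, so memory grows as O(n·k) instead of O(n²).

```python
import numpy as np

def dense_similarity_bytes(n, itemsize=8):
    """Memory for a dense n x n matrix (itemsize=8 assumes float64)."""
    return n * n * itemsize

def knn_neighbors(X, k=10, chunk=1024):
    """Return an (n, k) array of nearest-neighbor indices (self excluded).

    Hypothetical helper: exact search in row chunks, so at most a
    (chunk, n) distance block is held in memory at any time.
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    sq_norms = (X ** 2).sum(axis=1)
    neighbors = np.empty((n, k), dtype=np.int64)
    for start in range(0, n, chunk):
        stop = min(start + chunk, n)
        # Squared distances from this chunk of rows to every point.
        d = sq_norms[start:stop, None] + sq_norms[None, :] - 2.0 * (X[start:stop] @ X.T)
        # Exclude each point from its own neighbor list.
        d[np.arange(stop - start), np.arange(start, stop)] = np.inf
        # Unordered indices of the k smallest distances in each row.
        neighbors[start:stop] = np.argpartition(d, k - 1, axis=1)[:, :k]
    return neighbors

print(dense_similarity_bytes(6000))  # 288000000 bytes, i.e. 288 MB as float64
pts = np.array([[0.0], [1.0], [3.0], [7.0], [15.0]])
print(knn_neighbors(pts, k=1, chunk=2).ravel())  # [1 0 1 2 3]
```

An approximate nearest-neighbor index would also cut the compute cost, which this exact version does not.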
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Adversarial Robustness in Machine Learning · Machine Learning and Data Classification
