TL;DR
MacrOData is a large-scale benchmark suite of over 2,400 datasets for evaluating tabular outlier detection methods. It addresses the small scale and limited diversity of earlier benchmarks such as ADBench, which comprises only 57 datasets.
Contribution
The paper presents MacrOData, a new extensive benchmark suite with diverse datasets, standardized splits, and metadata, enabling robust evaluation of outlier detection techniques.
Findings
Classical, deep, and foundation-model outlier detectors are evaluated extensively across all three benchmark components.
MacrOData's scale and diversity improve the robustness of outlier detection evaluation.
A public leaderboard and openly available datasets facilitate future research and benchmarking.
Abstract
Quality benchmarks are essential for fairly and accurately tracking scientific progress and enabling practitioners to make informed methodological choices. Outlier detection (OD) on tabular data underpins numerous real-world applications, yet existing OD benchmarks remain limited. The prominent OD benchmark ADBench is the de facto standard in the literature, yet comprises only 57 datasets. In addition to other shortcomings discussed in this work, its small scale severely restricts diversity and statistical power. We introduce MacrOData, a large-scale benchmark suite for tabular OD comprising three carefully curated components: OddBench, with 790 datasets containing real-world semantic anomalies; OvrBench, with 856 datasets featuring real-world statistical outliers; and SynBench, with 800 synthetically generated datasets spanning diverse data priors and outlier archetypes. Owing to its…
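To make the abstract's point about scale and statistical power concrete, here is a rough back-of-the-envelope sketch. It sums the three component sizes stated above and compares the standard error of a mean per-dataset score difference at 57 versus 2,446 datasets. The independence of datasets, the shared standard deviation `sigma`, and its value of 0.10 are illustrative assumptions, not figures from the paper.

```python
import math

# Dataset counts as stated in the abstract.
COMPONENTS = {"OddBench": 790, "OvrBench": 856, "SynBench": 800}
ADBENCH_DATASETS = 57


def standard_error(sigma: float, n_datasets: int) -> float:
    """Standard error of a mean per-dataset score difference.

    Assumes (hypothetically) independent datasets that share a common
    standard deviation `sigma` of the per-dataset difference.
    """
    return sigma / math.sqrt(n_datasets)


total = sum(COMPONENTS.values())

# Hypothetical spread of per-dataset score differences between two
# detectors; this value is an assumption chosen only for illustration.
sigma = 0.10

se_small = standard_error(sigma, ADBENCH_DATASETS)
se_large = standard_error(sigma, total)

print(f"total datasets: {total}")
print(f"standard error at n={ADBENCH_DATASETS}: {se_small:.4f}")
print(f"standard error at n={total}: {se_large:.4f}")
print(f"ratio: {se_small / se_large:.2f}")
```

Under these assumptions the ratio depends only on the square root of the ratio of dataset counts, so the choice of `sigma` does not affect it. Real benchmark datasets are often correlated (for example, several derived from the same source), in which case the gain in power would be smaller than this sketch suggests.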
