Random Similarity Isolation Forests

Sebastian Chwilczyński; Dariusz Brzezinski

arXiv:2502.19122·cs.LG·July 1, 2025

TL;DR

This paper introduces Random Similarity Isolation Forest, a novel multi-modal outlier detection method that effectively handles datasets with mixed data types, outperforming existing algorithms on diverse benchmarks.

Contribution

The paper proposes a new outlier detection algorithm that processes multi-modal data directly, without fusing data-specific models or transforming all representations into a single format.

Findings

01. Outperforms five state-of-the-art competitors on 47 benchmark datasets.

02. Handles datasets with mixed data types such as time series, images, and graphs.

03. Highlights the importance of multi-modal approaches in improving anomaly detection.

Abstract

With predictive models becoming prevalent, companies are expanding the types of data they gather. As a result, the collected datasets consist not only of simple numerical features but also more complex objects such as time series, images, or graphs. Such multi-modal data have the potential to improve performance in predictive tasks like outlier detection, where the goal is to identify objects deviating from the main data distribution. However, current outlier detection algorithms are dedicated to individual types of data. Consequently, working with mixed types of data requires either fusing multiple data-specific models or transforming all of the representations into a single format, both of which can hinder predictive performance. In this paper, we propose a multi-modal outlier detection algorithm called Random Similarity Isolation Forest. Our method combines the notions of isolation…
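The abstract is cut off before it describes the mechanism, so the following is only a generic sketch of how isolation-style trees could split mixed-type objects using per-feature distances. It is not the authors' algorithm. The distance functions, the single-reference projection, the toy data, and every name below (`DISTANCES`, `build_tree`, `path_length`, `mean_path_lengths`) are assumptions made for illustration. Each node picks a random feature and a random reference object, projects every object onto its distance to that reference, and splits at a random threshold. Objects that end up alone after few splits are the likely outliers.

```python
import math
import random

# Hypothetical per-modality distances, one per feature position (an assumption;
# any metric suited to the data type could be plugged in here).
DISTANCES = [
    lambda a, b: abs(a - b),       # numeric feature
    lambda a, b: math.dist(a, b),  # fixed-length series feature
]

def build_tree(objects, rng, depth=0, max_depth=10):
    """Recursively split objects with random distance-to-reference thresholds."""
    if len(objects) <= 1 or depth >= max_depth:
        return None  # leaf: nothing left to separate
    feature = rng.randrange(len(DISTANCES))
    reference = rng.choice(objects)[feature]
    values = [DISTANCES[feature](obj[feature], reference) for obj in objects]
    lo, hi = min(values), max(values)
    if lo == hi:
        return None  # all objects are equidistant; no split is possible
    threshold = rng.uniform(lo, hi)
    left = [o for o, v in zip(objects, values) if v < threshold]
    right = [o for o, v in zip(objects, values) if v >= threshold]
    return {
        "feature": feature,
        "reference": reference,
        "threshold": threshold,
        "left": build_tree(left, rng, depth + 1, max_depth),
        "right": build_tree(right, rng, depth + 1, max_depth),
    }

def path_length(tree, obj):
    """Count the splits an object passes through before reaching a leaf."""
    depth = 0
    while tree is not None:
        f = tree["feature"]
        value = DISTANCES[f](obj[f], tree["reference"])
        tree = tree["left"] if value < tree["threshold"] else tree["right"]
        depth += 1
    return depth

def mean_path_lengths(objects, n_trees=50, seed=0):
    """Average path length per object; shorter means easier to isolate."""
    rng = random.Random(seed)
    trees = [build_tree(objects, rng) for _ in range(n_trees)]
    return [sum(path_length(t, o) for t in trees) / n_trees for o in objects]

# Toy data (made up): each object is (numeric value, short series).
inliers = [(i / 20, [i / 20, (i % 5) / 5, 0.5]) for i in range(20)]
outlier = (50.0, [50.0, 50.0, 50.0])
scores = mean_path_lengths(inliers + [outlier])
```

Here `scores[-1]` belongs to the made-up outlier, and a smaller value means the object was separated from the rest after fewer splits. The sketch omits the subsampling and leaf-size correction used in standard isolation forests, so the values are raw average depths rather than calibrated anomaly scores.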
