Reduced Robust Random Cut Forest for Out-Of-Distribution detection in machine learning models

Harsh Vardhan; Janos Sztipanovits

arXiv:2206.09247 · cs.LG · June 22, 2022 · 1 citation

Open Access

TL;DR

This paper introduces a Reduced Robust Random Cut Forest (RRRCF) method for efficiently detecting out-of-distribution data in machine learning models, suitable for both small and large datasets, with easy training and no complex hyper-parameter tuning.

Contribution

The paper presents a novel RRRCF approach that simplifies the Robust Random Cut Forest (RRCF) structure for effective out-of-distribution (OOD) detection across various dataset sizes.

Findings

01. Efficient inference for in/out-of-distribution data.

02. Applicable to low- and high-dimensional data.

03. Easy to train with minimal hyper-parameter tuning.

Abstract

Most machine learning-based regressors extract information from data collected via past observations of limited length to make predictions about the future. Consequently, when the input to these trained models has significantly different statistical properties from the training data, there is no guarantee of accurate prediction. Using these models on out-of-distribution input data may therefore produce a prediction completely different from the desired one, which is not only erroneous but can also be hazardous in some cases. Successful deployment of these machine learning models in any system requires a detection system that can distinguish between out-of-distribution and in-distribution data (i.e., data similar to the training data). In this paper, we introduce a novel approach for this detection process using a Reduced Robust Random Cut Forest (RRRCF) data…

Taxonomy

Topics: Machine Learning and Data Classification · Anomaly Detection Techniques and Applications · Neural Networks and Applications