Detecting and Classifying Outliers in Big Functional Data

Oluwasegun Taiwo Ojo; Antonio Fernández Anta; Rosa E. Lillo; Carlo Sguera

arXiv:1912.07287·stat.ME·October 15, 2021·Adv. Data Anal. Classif.

TL;DR

This paper introduces two scalable outlier detection methods, Semifast-MUOD and Fast-MUOD, both built on MUOD, that identify and classify outliers in large functional data sets with better accuracy and lower computational cost than MUOD.

Contribution

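The paper presents two outlier detection methods for big functional data that extend MUOD: Semifast-MUOD computes the MUOD indices from a sample of the observations, and Fast-MUOD computes them against the point-wise or $L_{1}$ median, improving both detection performance and speed.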

Findings

01

Fast-MUOD handles large datasets efficiently with minimal computational time.

02

Proposed methods outperform MUOD in detection accuracy and speed.

03

The methods prove effective in applications to weather, population, and video data.

Abstract

We propose two new outlier detection methods for identifying and classifying different types of outliers in (big) functional data sets. The proposed methods are based on an existing method called Massive Unsupervised Outlier Detection (MUOD). MUOD detects and classifies outliers by computing, for each curve, three indices, all based on the concepts of linear regression and correlation, which measure outlyingness in terms of shape, magnitude and amplitude relative to the other curves in the data. 'Semifast-MUOD', the first method, uses a sample of the observations in computing the indices, while 'Fast-MUOD', the second method, uses the point-wise or $L_{1}$ median in computing the indices. The classical boxplot is used to separate the indices of the outliers from those of the typical observations. Performance evaluation of the proposed methods using simulated data shows significant…
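
The abstract does not give the index formulas, so the following is only a rough sketch of the Fast-MUOD idea under stated assumptions: each curve is regressed on the point-wise median curve, the shape index is taken as |ρ − 1| (ρ the Pearson correlation), the amplitude index as |β − 1| (β the regression slope), and the magnitude index as |α| (α the intercept). The function names `fast_muod_indices` and `boxplot_outliers` are hypothetical and are not the API of the otsegun/fastmuod repository; the 1.5 × IQR upper fence is the usual boxplot convention, assumed here.

```python
import numpy as np


def fast_muod_indices(curves):
    """Shape, amplitude and magnitude indices of each curve relative to
    the point-wise median curve (assumed formulas, see the text above).

    curves: array of shape (n_curves, n_points), one curve per row.
    Constant curves or a constant median would divide by zero and are
    not handled in this sketch.
    """
    curves = np.asarray(curves, dtype=float)
    n_points = curves.shape[1]

    median = np.median(curves, axis=0)            # point-wise median curve
    med_c = median - median.mean()                # centred median
    cur_c = curves - curves.mean(axis=1, keepdims=True)  # centred curves

    cov = cur_c @ med_c / (n_points - 1)          # cov(curve_i, median)
    var_med = med_c @ med_c / (n_points - 1)      # var(median)
    sd_cur = np.sqrt((cur_c ** 2).sum(axis=1) / (n_points - 1))

    rho = cov / (sd_cur * np.sqrt(var_med))       # Pearson correlation
    beta = cov / var_med                          # regression slope
    alpha = curves.mean(axis=1) - beta * median.mean()  # intercept

    return np.abs(rho - 1.0), np.abs(beta - 1.0), np.abs(alpha)


def boxplot_outliers(index, whisker=1.5):
    """Flag index values above the upper boxplot fence Q3 + whisker * IQR."""
    q1, q3 = np.percentile(index, [25, 75])
    return index > q3 + whisker * (q3 - q1)
```

Applying `boxplot_outliers` to each of the three returned arrays gives one set of flags per outlier type, which is how a curve can be labelled as a shape, amplitude or magnitude outlier. Comparing against a single median curve rather than every other curve keeps the cost linear in the number of curves.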

Code & Models

Repositories

otsegun/fastmuod
Official