Contextual Unsupervised Outlier Detection in Sequences

Mohamed A. Zahran; Leonardo Teixeira; Vinayak Rao; Bruno Ribeiro

arXiv:2111.03808 · cs.LG · November 9, 2021

Open Access

TL;DR

This paper introduces an unsupervised framework for sequence outlier detection that combines ranking tests with user sequence models. It identifies outliers at a specified false positive rate (FPR) and shows improved accuracy over existing approaches on real and simulated datasets.

Contribution

It presents an unsupervised method for sequence outlier detection that integrates ranking tests with user sequence models. Apart from the desired false positive rate, the method is parameter-free, and it is applied to a large real-world dataset of Pinterest and Facebook users.

Findings

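01. Improved outlier detection accuracy over existing approaches on datasets with known ground truth.

02. Users tend to re-share Pinterest posts of Facebook friends significantly more than posts of other types of users, pointing to a potential influence of Facebook friendship on Pinterest sharing behavior.

03. Demonstrated effectiveness on real and simulated datasets based on user actions at last.fm and msnbc.com.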

Abstract

This work proposes an unsupervised learning framework for trajectory (sequence) outlier detection that combines ranking tests with user sequence models. The overall framework identifies sequence outliers at a desired false positive rate (FPR), in an otherwise parameter-free manner. We evaluate our methodology on a collection of real and simulated datasets based on user actions at the websites last.fm and msnbc.com, where we know ground truth, and demonstrate improved accuracy over existing approaches. We also apply our approach to a large real-world dataset of Pinterest and Facebook users, where we find that users tend to re-share Pinterest posts of Facebook friends significantly more than posts of other types of users, pointing to a potential influence of Facebook friendship on sharing behavior on Pinterest.
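
To make the idea of pairing a sequence model with a ranking test more concrete, the following is a minimal sketch. It is not the authors' algorithm. It assumes a first-order Markov chain as the sequence model and a generic rank-based (conformal-style) p-value as the test, and all function names (`fit_markov`, `avg_log_lik`, `rank_p_value`, `is_outlier`) and the toy data are hypothetical. The only tuning input is the target FPR: if the test sequence's score is exchangeable with the held-out reference scores, the chance of flagging it is at most `fpr`.

```python
import math

def fit_markov(sequences, states, smoothing=1.0):
    """Fit a first-order Markov chain with additive smoothing (assumed model)."""
    counts = {s: {t: smoothing for t in states} for s in states}
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            counts[a][b] += 1
    return {
        s: {t: c / sum(row.values()) for t, c in row.items()}
        for s, row in counts.items()
    }

def avg_log_lik(model, seq):
    """Mean log-probability per transition; seq needs at least two items."""
    steps = list(zip(seq, seq[1:]))
    return sum(math.log(model[a][b]) for a, b in steps) / len(steps)

def rank_p_value(score, reference_scores):
    """Rank of the test score among reference scores, as a p-value.

    Counts reference scores at or below the test score, plus one for the
    test score itself, so the smallest possible value is 1 / (n + 1).
    """
    at_or_below = sum(1 for r in reference_scores if r <= score)
    return (at_or_below + 1) / (len(reference_scores) + 1)

def is_outlier(model, seq, reference_scores, fpr=0.05):
    """Flag seq when its rank-based p-value does not exceed the target FPR."""
    return rank_p_value(avg_log_lik(model, seq), reference_scores) <= fpr

# Toy data: a "user" who alternates between actions a and b.
states = "abc"
train = ["ababababab", "babababa", "abababab"]
model = fit_markov(train, states)

# Held-out sequences from the same user, used only to calibrate the ranking.
held_out = ["ab" * k for k in range(2, 21)]  # 19 sequences
reference = [avg_log_lik(model, s) for s in held_out]

print(is_outlier(model, "abababab", reference))  # typical behavior
print(is_outlier(model, "aaccbbcc", reference))  # unusual transitions
```

With 19 reference sequences the smallest attainable p-value is 1/20, so this sketch can only flag anything at `fpr` of 0.05 or higher; a stricter FPR needs a larger reference set.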

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Network Security and Intrusion Detection · Time Series Analysis and Forecasting