Novelty Detection in Sequential Data by Informed Clustering and Modeling

Linara Adilova; Siming Chen; Michael Kamp

arXiv:2103.03943·cs.LG·July 11, 2023·1 cites

TL;DR

This paper introduces an informed clustering approach for novelty detection in discrete sequences, leveraging domain expertise and LSTM models to improve detection accuracy over traditional methods.

Contribution

The paper presents a novel informed clustering method combined with LSTM modeling that enhances novelty detection in discrete sequences, outperforming existing approaches.

Findings

01

Informed clustering outperforms automatic clustering.

02

Decomposition improves detection despite less data per cluster.

03

Approach outperforms state-of-the-art methods in real-world scenarios.

Abstract

Novelty detection in discrete sequences is a challenging task, since deviations from the process generating the normal data are often small or intentionally hidden. Novelties can be detected by modeling normal sequences and measuring the deviations of a new sequence from the model predictions. However, in many applications data is generated by several distinct processes, so that models trained on all the data tend to over-generalize and novelties remain undetected. We propose to approach this challenge through decomposition: by clustering the data we break down the problem, obtaining a simpler modeling task in each cluster, which can be modeled more accurately. However, this comes at a trade-off, since the amount of training data per cluster is reduced. This is a particular problem for discrete sequences, where state-of-the-art models are data-hungry. The success of this approach thus…
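
The decomposition idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a smoothed bigram model stands in for the LSTM, the cluster labels "A" and "B" are hypothetical expert-assigned groups, and the toy sequences are invented for illustration. The novelty score is the mean negative log-probability of a sequence's transitions under the model of its cluster.

```python
import math
from collections import Counter, defaultdict


class BigramModel:
    """Smoothed bigram model; a lightweight stand-in for an LSTM (assumption)."""

    def __init__(self, alphabet, alpha=1.0):
        self.alphabet = sorted(alphabet)
        self.alpha = alpha  # add-alpha (Laplace) smoothing constant
        self.counts = defaultdict(Counter)

    def fit(self, sequences):
        # Count every observed transition prev -> nxt.
        for seq in sequences:
            for prev, nxt in zip(seq, seq[1:]):
                self.counts[prev][nxt] += 1
        return self

    def prob(self, prev, nxt):
        row = self.counts[prev]
        total = sum(row.values())
        return (row[nxt] + self.alpha) / (total + self.alpha * len(self.alphabet))

    def novelty_score(self, seq):
        """Mean negative log-probability of the observed transitions."""
        pairs = list(zip(seq, seq[1:]))
        if not pairs:
            return 0.0
        return -sum(math.log(self.prob(p, n)) for p, n in pairs) / len(pairs)


def fit_per_cluster(labeled_sequences, alphabet):
    """Train one model per cluster label (labels are hypothetical here)."""
    groups = defaultdict(list)
    for label, seq in labeled_sequences:
        groups[label].append(seq)
    return {label: BigramModel(alphabet).fit(seqs) for label, seqs in groups.items()}


# Invented toy data: cluster "A" alternates symbols, cluster "B" repeats them.
train = [
    ("A", "ababab"), ("A", "bababa"),
    ("B", "aaaaaa"), ("B", "bbbbbb"),
]
alphabet = {"a", "b"}

global_model = BigramModel(alphabet).fit(seq for _, seq in train)
cluster_models = fit_per_cluster(train, alphabet)

# A sequence that is normal for "B" but arrives in cluster "A".
suspect = "aaaaaa"
global_score = global_model.novelty_score(suspect)
cluster_score = cluster_models["A"].novelty_score(suspect)
normal_score = cluster_models["A"].novelty_score("ababab")
```

In this toy setting the model trained on all data has seen both behaviors and gives the suspect sequence a low score, while the model for cluster "A" scores it well above a typical "A" sequence. The trade-off named in the abstract is also visible: each cluster model is fit on half as many sequences as the global one.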

Code & Models

Repositories

kampmichael/noveltydetectionsequentialdata
PyTorch · Official

Taxonomy

Topics: Anomaly Detection Techniques and Applications · Data-Driven Disease Surveillance · Data Visualization and Analytics