Probabilistic Modeling for Novelty Detection with Applications to Fraud Identification
Rémi Domingues

TL;DR
This paper explores probabilistic models for novelty detection, focusing on mixed-type and sequence data, proposing new methods that improve accuracy, interpretability, scalability, and robustness in anomaly identification tasks.
Contribution
It introduces a probabilistic nonparametric approach using Dirichlet process mixtures and deep Gaussian process autoencoders for enhanced novelty detection.
Findings
The probabilistic nonparametric method outperforms traditional techniques on mixed-type data.
The autoencoder-based model with deep Gaussian processes improves anomaly detection on sequence data.
The work includes a comprehensive comparison of novelty detection methods across data types.
Abstract
Novelty detection is the unsupervised problem of identifying anomalies in test data that differ significantly from the training set. It is one of the classic challenges in Machine Learning and a core component of several research areas such as fraud detection, intrusion detection, medical diagnosis, data cleaning, and fault prevention. While numerous algorithms have been designed to address this problem, most methods are only suitable for modeling continuous numerical data. Tackling datasets composed of mixed-type features, such as numerical and categorical data, or temporal datasets describing discrete event sequences is a challenging task. In addition to the supported data types, the key criteria for efficient novelty detection methods are the ability to accurately dissociate novelties from nominal samples, the interpretability, the scalability and the robustness to anomalies…
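As a rough illustration of the problem setting described in the abstract, the sketch below fits a simple density model to nominal training rows and flags test rows whose likelihood is unusually low. It is not the Dirichlet process mixture or the deep Gaussian process autoencoder from this work: it assumes an independent Gaussian for one numerical feature and a Laplace-smoothed categorical distribution for one categorical feature. The field names (`amount`, `channel`), the toy data, and the threshold rule (the highest score seen in training) are all hypothetical choices made for the example.

```python
import math
from collections import Counter
from statistics import mean, pstdev


def fit(train):
    """Fit a Gaussian (numeric) and a categorical model to nominal rows.

    Each row is (amount: float, channel: str); both fields are hypothetical.
    """
    amounts = [a for a, _ in train]
    mu, sigma = mean(amounts), pstdev(amounts)
    counts = Counter(c for _, c in train)
    return mu, sigma, counts, len(train)


def novelty_score(model, row, alpha=1.0):
    """Return the negative log-likelihood of a row; higher means more novel."""
    mu, sigma, counts, n = model
    amount, channel = row
    # Log-density of the numeric feature under the fitted Gaussian.
    log_num = (-0.5 * math.log(2 * math.pi * sigma ** 2)
               - (amount - mu) ** 2 / (2 * sigma ** 2))
    # Laplace-smoothed probability of the categorical feature; one extra
    # slot is reserved so unseen categories get non-zero probability.
    k = len(counts) + 1
    log_cat = math.log((counts.get(channel, 0) + alpha) / (n + alpha * k))
    return -(log_num + log_cat)


# Toy nominal training set (assumed to contain no anomalies).
train = [(20.0, "web"), (22.0, "web"), (19.0, "store"),
         (21.0, "web"), (18.0, "store"), (20.0, "web")]
model = fit(train)

# Assumed threshold rule: anything scoring above every training row is novel.
threshold = max(novelty_score(model, r) for r in train)

test_rows = [(20.5, "web"), (95.0, "phone")]
flags = [novelty_score(model, r) > threshold for r in test_rows]
print(flags)
```

Because the score is a sum of per-feature log-likelihood terms, each term can be inspected separately to see which feature made a row look novel, which is one simple form of the interpretability criterion the abstract mentions. The independence assumption between features is a simplification made for brevity.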
Taxonomy
Topics: Anomaly Detection Techniques and Applications · Time Series Analysis and Forecasting · Metabolomics and Mass Spectrometry Studies
