Sampling Method for Fast Training of Support Vector Data Description

Arin Chaudhuri; Deovrat Kakde; Maria Jahja; Wei Xiao; Hansi Jiang; Seunghyun Kong; Sergiy Peredriy

arXiv:1606.05382·cs.LG·November 2, 2018

TL;DR

This paper introduces a sampling-based iterative method to significantly speed up SVDD training for large datasets, maintaining good data description quality in outlier detection tasks.

Contribution

A novel sampling method for SVDD training that reduces computation time while preserving description accuracy in large-scale outlier detection.

Findings

01. The method is extremely fast compared to traditional SVDD.

02. It provides a good data description with reduced computation.

03. It is effective for big-data process-monitoring applications.

Abstract

Support Vector Data Description (SVDD) is a popular outlier detection technique that constructs a flexible description of the input data. SVDD computation time is high for large training datasets, which limits its use in big-data process-monitoring applications. We propose a new iterative sampling-based method for SVDD training. The method incrementally learns the training data description at each iteration by computing SVDD on an independent random sample selected with replacement from the training data set. The experimental results indicate that the proposed method is extremely fast and provides a good data description.
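
The iterative idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and several pieces are assumptions made only for illustration: the SVDD solve is replaced by a Badoiu-Clarkson minimum-enclosing-ball approximation (hard-margin SVDD with a linear kernel reduces to that problem), "support vectors" are approximated by points near the ball surface, and the stopping rule (radius stable for a few consecutive iterations) is hypothetical. The names `enclosing_ball`, `boundary_points`, `sampling_svdd`, and all parameter defaults are likewise made up here.

```python
import math
import random


def enclosing_ball(points, n_steps=100):
    """Badoiu-Clarkson approximation of the minimum enclosing ball.

    Stand-in for an SVDD solve (assumption): hard-margin SVDD with a
    linear kernel reduces to this problem. Returns (center, radius).
    """
    center = list(points[0])
    for t in range(1, n_steps + 1):
        far = max(points, key=lambda p: math.dist(p, center))
        # Move the center a shrinking step toward the farthest point.
        center = [c + (f - c) / (t + 1) for c, f in zip(center, far)]
    radius = max(math.dist(p, center) for p in points)
    return center, radius


def boundary_points(points, center, radius, margin=0.05):
    """Points within `margin` of the ball surface.

    Assumed proxy for the support vectors of the description.
    """
    return [p for p in points if math.dist(p, center) >= (1.0 - margin) * radius]


def sampling_svdd(data, sample_size=10, tol=1e-3, patience=5, max_iter=50, rng=None):
    """Sampling loop: describe a small random sample, merge, re-solve, repeat."""
    rng = rng or random.Random(0)
    master, center, radius, stable = [], None, 0.0, 0
    for _ in range(max_iter):
        # Independent random sample drawn with replacement; drop duplicates.
        sample = list(dict.fromkeys(rng.choices(data, k=sample_size)))
        c_s, r_s = enclosing_ball(sample)
        sample_sv = boundary_points(sample, c_s, r_s)
        # Merge with the master set and re-solve on the small union only.
        union = list(dict.fromkeys(master + sample_sv))
        center, new_radius = enclosing_ball(union)
        master = boundary_points(union, center, new_radius)
        # Hypothetical stopping rule: radius unchanged for `patience` rounds.
        stable = stable + 1 if abs(new_radius - radius) <= tol * new_radius else 0
        radius = new_radius
        if stable >= patience:
            break
    return master, center, radius


# Usage: 2000 points drawn uniformly from the unit disk (tuples, so hashable).
demo_rng = random.Random(42)
data = []
while len(data) < 2000:
    p = (demo_rng.uniform(-1, 1), demo_rng.uniform(-1, 1))
    if math.hypot(*p) <= 1.0:
        data.append(p)

support, center, radius = sampling_svdd(data, rng=demo_rng)
print(len(support), "boundary points kept, radius", round(radius, 3))
```

Each solve in this sketch touches only a small sample or the small merged set, never the full training set, which is where the speed-up described in the abstract would come from. A kernelized SVDD solver could replace `enclosing_ball` without changing the loop structure.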
