An outlier map for Support Vector Machine classification
Michiel Debruyne

TL;DR
This paper introduces an outlier map for SVM classification that visualizes data and identifies outliers in high-dimensional kernel spaces, enhancing data quality assessment.
Contribution
It extends the Stahel-Donoho outlyingness measure to kernel spaces and proposes a trimmed SVM for outlier detection and visualization.
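As a rough illustration of the idea, the Stahel-Donoho outlyingness of a point is the largest robustly standardized distance from the bulk of the data over a set of projection directions: |projection - median| / MAD. The sketch below is not taken from the paper. It assumes directions of the form phi(x_i) - phi(x_j) through random pairs of observations, so that each projection can be read off a Gram matrix `K` as `K[:, i] - K[:, j]`. The function name `kernel_sd_outlyingness` and its parameters are hypothetical.

```python
import numpy as np

def kernel_sd_outlyingness(K, n_directions=500, seed=0):
    """Approximate Stahel-Donoho outlyingness from a Gram matrix K (n x n).

    Assumption (not confirmed by the source): directions are differences
    phi(x_i) - phi(x_j) of randomly chosen pairs of observations.
    """
    K = np.asarray(K, dtype=float)
    rng = np.random.default_rng(seed)
    n = K.shape[0]
    i = rng.integers(0, n, size=n_directions)
    j = rng.integers(0, n, size=n_directions)
    keep = i != j  # a pair of identical points gives no direction
    # Projection of every point on each direction; the direction's norm
    # cancels in the ratio below, so no normalization is needed.
    P = K[:, i[keep]] - K[:, j[keep]]
    med = np.median(P, axis=0)
    mad = np.median(np.abs(P - med), axis=0)
    ok = mad > 0  # skip degenerate directions with zero spread
    return np.max(np.abs(P[:, ok] - med[ok]) / mad[ok], axis=1)

# Toy usage with a linear kernel: 100 Gaussian points plus one far-away point.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(size=(100, 5)), np.full((1, 5), 10.0)])
out = kernel_sd_outlyingness(X @ X.T)
print(int(np.argmax(out)))  # index of the most outlying observation
```

With a linear kernel this reduces to the classical input-space measure. Any other positive semidefinite kernel matrix can be passed in the same way, which is what makes the approach applicable to nonnumerical data.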
Findings
Effective visualization of outliers in high-dimensional data
Application to biological datasets demonstrates practical utility
Improved data quality assessment in SVM classification
Abstract
Support Vector Machines are a widely used classification technique. They are computationally efficient and provide excellent predictions even for high-dimensional data. Moreover, Support Vector Machines are very flexible due to the incorporation of kernel functions. The latter make it possible to model nonlinearity, but also to deal with nonnumerical data such as protein strings. However, Support Vector Machines can suffer a lot from unclean data containing, for example, outliers or mislabeled observations. Although several outlier detection schemes have been proposed in the literature, the selection of outliers versus nonoutliers is often rather ad hoc and does not provide much insight into the data. In robust multivariate statistics, outlier maps are quite popular tools to assess the quality of data under consideration. They provide a visual representation of the data depicting several types of…
