Robust Classification of High Dimension Low Sample Size Data
Necla Gunduz, Ernest Fokoue

TL;DR
This paper compares various robust classification techniques in high-dimensional, low-sample-size settings, revealing that Random Forest outperforms specialized robust methods in predictive accuracy on diverse datasets.
Contribution
It provides a comprehensive comparison of robust discriminant analysis, robust PCA, and Random Forest, highlighting the superior predictive performance of Random Forest in high-dimensional low sample size data.
Findings
Random Forest outperforms other robust methods in predictive accuracy.
Robust PCA and discriminant analysis show limited robustness in high-dimensional settings.
Random Forest is effective despite not being specifically designed for robustness.
Abstract
The robustification of pattern recognition techniques has been the subject of intense research in recent years. Despite the multiplicity of papers on the subject, very few articles have deeply explored the topic of robust classification in the high dimension low sample size context. In this work, we explore and compare the predictive performances of robust classification techniques, with a special concentration on robust discriminant analysis and robust PCA, applied to a wide variety of large p, small n data sets. We also explore the performance of random forest as a way of comparing and contrasting single-model methods and ensemble methods in this context. Our work reveals that Random Forest, although not inherently designed to be robust to outliers, substantially outperforms the existing techniques specifically designed to achieve robustness. Indeed, random forest…
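The kind of comparison the abstract describes can be illustrated with a small simulation. The sketch below is a toy setup under stated assumptions, not the authors' protocol, data, or methods: it simulates two Gaussian classes with far more features than samples, contaminates a few training rows with gross outliers, and compares a mean-based nearest-centroid rule, a coordinatewise-median variant (a crude stand-in for a robust single model), and a bagged ensemble of decision stumps with random feature subsets (a simplified stand-in for Random Forest). All function names and parameter values are hypothetical.

```python
import random
import statistics


def make_hdlss(n_per_class=10, p=200, n_informative=10, shift=2.0,
               n_outliers=0, seed=0):
    """Simulate two Gaussian classes with p >> n; optionally contaminate rows."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(p)]
            if label == 1:
                # Only the first n_informative features separate the classes.
                for j in range(n_informative):
                    row[j] += shift
            X.append(row)
            y.append(label)
    # Assumption: outliers are whole rows replaced by high-variance noise.
    for i in rng.sample(range(len(X)), n_outliers):
        X[i] = [rng.gauss(0.0, 20.0) for _ in range(p)]
    return X, y


def centroid_fit(X, y, center):
    """Per-class centroid using `center` (mean or median) on each feature."""
    p = len(X[0])
    return {c: [center([r[j] for r, lab in zip(X, y) if lab == c])
                for j in range(p)] for c in (0, 1)}


def centroid_predict(model, row):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    return min((0, 1),
               key=lambda c: sum((a - b) ** 2 for a, b in zip(row, model[c])))


def fit_stump(X, y, features):
    """Pick the single-feature threshold rule with the fewest training errors."""
    best = None
    for j in features:
        m0 = statistics.fmean([r[j] for r, lab in zip(X, y) if lab == 0])
        m1 = statistics.fmean([r[j] for r, lab in zip(X, y) if lab == 1])
        thr = (m0 + m1) / 2
        hi = 1 if m1 >= m0 else 0  # label predicted above the threshold
        errors = sum((hi if r[j] > thr else 1 - hi) != lab
                     for r, lab in zip(X, y))
        if best is None or errors < best[0]:
            best = (errors, j, thr, hi)
    return best[1:]


def fit_bagged_stumps(X, y, n_stumps=200, n_features=14, seed=0):
    """Bagging with random feature subsets; bootstrap is stratified by class."""
    rng = random.Random(seed)
    idx = {c: [i for i, lab in enumerate(y) if lab == c] for c in (0, 1)}
    p = len(X[0])
    stumps = []
    for _ in range(n_stumps):
        boot = [rng.choice(idx[c]) for c in (0, 1) for _ in idx[c]]
        feats = rng.sample(range(p), n_features)
        stumps.append(fit_stump([X[i] for i in boot],
                                [y[i] for i in boot], feats))
    return stumps


def predict_bagged(stumps, row):
    """Majority vote over all stumps (ties go to class 1)."""
    votes = sum(hi if row[j] > thr else 1 - hi for j, thr, hi in stumps)
    return 1 if 2 * votes >= len(stumps) else 0


def accuracy(predict, X, y):
    return sum(predict(row) == lab for row, lab in zip(X, y)) / len(y)


# Contaminated training set (20 samples, 200 features) and a clean test set.
X_train, y_train = make_hdlss(n_outliers=3, seed=1)
X_test, y_test = make_hdlss(n_per_class=50, seed=2)

mean_model = centroid_fit(X_train, y_train, statistics.fmean)
median_model = centroid_fit(X_train, y_train, statistics.median)
stumps = fit_bagged_stumps(X_train, y_train)

results = {
    "mean centroid": accuracy(lambda r: centroid_predict(mean_model, r), X_test, y_test),
    "median centroid": accuracy(lambda r: centroid_predict(median_model, r), X_test, y_test),
    "bagged stumps": accuracy(lambda r: predict_bagged(stumps, r), X_test, y_test),
}
for name, acc in results.items():
    print(f"{name}: test accuracy = {acc:.2f}")
```

The numbers this prints depend on the seeds and on the contamination scheme, so they should not be read as reproducing the paper's findings. A faithful replication would use the paper's data sets and established implementations of robust discriminant analysis, robust PCA, and Random Forest.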
Taxonomy
Topics: Advanced Statistical Methods and Models · Spectroscopy and Chemometric Analyses · Statistical Methods and Inference
