Diagonal Discriminant Analysis with Feature Selection for High Dimensional Data

Sarah Elizabeth Romanes; John Thomas Ormerod; Jean YH Yang

arXiv:1807.01422·stat.ML·July 5, 2018

TL;DR

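This paper presents multiDA, a high-dimensional discriminant analysis method that integrates multiclass diagonal discriminant analysis with feature selection in a single hybrid model, improving prediction accuracy, interpretability of the selected features, and run time on publicly available high-dimensional datasets.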

Contribution

Introducing multiDA, a novel hybrid discriminant analysis model with integrated feature selection for high-dimensional data.

Findings

1. Improved prediction accuracy over existing methods
2. Enhanced interpretability of selected features
3. Faster algorithm run time

Abstract

We introduce a new method of performing high dimensional discriminant analysis, which we call multiDA. We achieve this by constructing a hybrid model that seamlessly integrates a multiclass diagonal discriminant analysis model and feature selection components. Our feature selection component naturally simplifies to weights which are simple functions of likelihood ratio statistics, allowing natural comparisons with traditional hypothesis testing methods. We provide heuristic arguments suggesting desirable asymptotic properties of our algorithm with regard to feature selection. We compare our method with several other approaches, showing marked improvements in prediction accuracy, interpretability of chosen features, and algorithm run time. We demonstrate such strengths of our model by showing strong classification performance on publicly available high dimensional datasets, as…
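
As an illustration only, the following Python sketch shows one simple way to combine diagonal discriminant analysis with feature screening based on likelihood ratio statistics. It is not the authors' multiDA algorithm. It assumes Gaussian features with a variance shared across classes, tests each feature for "one common mean" against "class-specific means", and keeps the features whose statistic exceeds a BIC-style threshold. The function names (`lrt_statistics`, `fit_diagonal_da`, `predict`) and the default threshold are hypothetical choices made for this sketch.

```python
import numpy as np


def lrt_statistics(X, y):
    """Per-feature likelihood ratio statistic: common mean vs. class means.

    Assumes Gaussian features with one variance shared across classes,
    which gives the statistic n * log(RSS_null / RSS_alt).
    """
    n = X.shape[0]
    rss_null = ((X - X.mean(axis=0)) ** 2).sum(axis=0)
    rss_alt = np.zeros(X.shape[1])
    for k in np.unique(y):
        Xk = X[y == k]
        rss_alt += ((Xk - Xk.mean(axis=0)) ** 2).sum(axis=0)
    return n * np.log(rss_null / rss_alt)


def fit_diagonal_da(X, y, penalty=None):
    """Screen features by LRT statistic, then fit a diagonal LDA on them."""
    n, _ = X.shape
    classes = np.unique(y)
    K = len(classes)
    if penalty is None:
        # BIC-style threshold: an assumption of this sketch, not the paper's rule.
        penalty = (K - 1) * np.log(n)
    selected = np.flatnonzero(lrt_statistics(X, y) > penalty)
    means = np.vstack([X[y == k][:, selected].mean(axis=0) for k in classes])
    # Pooled within-class variance, one value per selected feature.
    resid = X[:, selected] - means[np.searchsorted(classes, y)]
    var = (resid ** 2).sum(axis=0) / (n - K)
    priors = np.array([(y == k).mean() for k in classes])
    return {"classes": classes, "selected": selected,
            "means": means, "var": var, "priors": priors}


def predict(model, X):
    """Assign each row to the class with the largest diagonal LDA score."""
    Xs = X[:, model["selected"]]
    diff = Xs[:, None, :] - model["means"][None, :, :]
    # score[i, k] = log(prior_k) - 0.5 * sum_j (x_ij - mu_kj)^2 / var_j
    scores = np.log(model["priors"]) - 0.5 * (diff ** 2 / model["var"]).sum(axis=2)
    return model["classes"][scores.argmax(axis=1)]
```

Because the covariance is treated as diagonal, both the screening step and the classifier reduce to per-feature sums, so the cost grows linearly with the number of features. Calling `fit_diagonal_da(X, y)` on an `(n, p)` array `X` with integer labels `y` returns the selected feature indices alongside the fitted parameters, and `predict(model, X_new)` returns predicted labels.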

Code & Models

Repositories

sarahromanes/multiDA

Taxonomy

Topics: Statistical Methods and Inference · Gene expression and cancer classification · Advanced Statistical Methods and Models