Improving Covariance-Regularized Discriminant Analysis for EHR-based Predictive Analytics of Diseases
Sijia Yang, Haoyi Xiong, Kaibo Xu, Licheng Wang, Jiang Bian, Zeyi Sun

TL;DR
This paper enhances covariance-regularized LDA for disease prediction from EHR data, addressing high-dimensional, small-sample challenges with a novel De-Sparse classifier that improves accuracy through better covariance estimation.
Contribution
The paper introduces a theoretical model of LDA accuracy under arbitrary data distributions and proposes De-Sparse, a new LDA classifier that uses the De-sparsified Graphical Lasso to improve classification of High Dimension Low Sample Size (HDLSS) data.
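As a rough illustration of the idea, the sketch below plugs a de-sparsified precision matrix into a two-class LDA rule. It assumes the commonly cited de-sparsified form 2Θ − ΘΣΘ, which this summary does not state, so it should not be read as the authors' exact estimator. The function names `desparsify` and `lda_predict` and the toy matrices are hypothetical, and a deliberately shrunken precision matrix stands in for a Graphical Lasso fit.

```python
import numpy as np


def desparsify(theta_hat, sigma_hat):
    """One-step bias correction 2*Theta - Theta @ Sigma @ Theta.

    Assumed form of the de-sparsified estimator; not taken from the summary.
    """
    return 2.0 * theta_hat - theta_hat @ sigma_hat @ theta_hat


def lda_predict(X, mu0, mu1, precision, prior1=0.5):
    """Two-class LDA rule: predict 1 when the linear score is positive."""
    w = precision @ (mu1 - mu0)
    b = -0.5 * (mu0 + mu1) @ w + np.log(prior1 / (1.0 - prior1))
    return (X @ w + b > 0).astype(int)


# Toy covariance and its exact inverse (hypothetical values).
sigma = np.array([[2.0, 0.5], [0.5, 1.0]])
theta_true = np.linalg.inv(sigma)

# Stand-in for a regularized estimate that is biased toward zero.
theta_shrunk = 0.5 * theta_true
theta_fixed = desparsify(theta_shrunk, sigma)

# Distance to the true precision matrix before and after the correction.
err_before = np.linalg.norm(theta_shrunk - theta_true)
err_after = np.linalg.norm(theta_fixed - theta_true)

# Classify two points that sit on the two class means.
mu0, mu1 = np.array([0.0, 0.0]), np.array([3.0, 3.0])
labels = lda_predict(np.array([[0.0, 0.0], [3.0, 3.0]]), mu0, mu1, theta_fixed)
```

In this toy case the correction moves the shrunken estimate closer to the true precision matrix and leaves an exact inverse unchanged. Nothing here reflects the paper's experiments.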
Findings
De-Sparse outperforms existing HDLSS LDA methods.
Theoretical bounds explain factors affecting LDA accuracy.
Experimental results on EHR datasets support the effectiveness of De-Sparse.
Abstract
Linear Discriminant Analysis (LDA) is a well-known technique for feature extraction and dimension reduction. The performance of classical LDA, however, degrades significantly on High Dimension Low Sample Size (HDLSS) data because of the ill-posed inverse problem. Existing approaches to HDLSS data classification typically assume that the data follow a Gaussian distribution and address the HDLSS classification problem through regularization. However, these assumptions are too strict to hold in many emerging real-life applications, such as enabling personalized predictive analysis using Electronic Health Records (EHRs) data collected from an extremely limited number of patients who have been diagnosed with or without the target disease for prediction. In this paper, we revisit the problem of predictive analysis of disease using personal EHR data and an LDA classifier. To fill the gap, in this…
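The ill-posed inverse problem mentioned in the abstract can be seen in a small numerical sketch. With fewer samples than features the sample covariance matrix is rank deficient, so the inverse that classical LDA needs does not exist. The example uses a simple ridge term as the regularizer, which is a stand-in chosen for brevity and not the Graphical Lasso approach the paper builds on. The sizes `n`, `p` and the penalty `lam` are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# HDLSS setting: far fewer samples (n) than features (p). Values are arbitrary.
n, p = 20, 50
X = rng.normal(size=(n, p))

# Sample covariance of the centered data.
Xc = X - X.mean(axis=0)
S = Xc.T @ Xc / (n - 1)

# Centering removes one degree of freedom, so rank(S) <= n - 1 < p.
rank = np.linalg.matrix_rank(S)

# Ridge-style regularization (a stand-in, not the paper's estimator):
# adding lam * I lifts every eigenvalue by lam, making the matrix invertible.
lam = 0.1
S_reg = S + lam * np.eye(p)
min_eig = np.linalg.eigvalsh(S_reg).min()
precision = np.linalg.inv(S_reg)

print(f"rank of S: {rank} (p = {p})")
print(f"smallest eigenvalue after regularization: {min_eig:.3f}")
```

The regularized matrix can be inverted and used in an LDA rule, at the cost of bias in the estimated precision matrix.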
Taxonomy
Topics: Gene expression and cancer classification · Traditional Chinese Medicine Studies · Face and Expression Recognition
Methods: Linear Discriminant Analysis
