Sparse semiparametric discriminant analysis for high-dimensional zero-inflated data
Hee Cheol Chung, Yang Ni, Irina Gaynanova

TL;DR
This paper introduces a sparse semiparametric discriminant analysis method for high-dimensional, zero-inflated biological data. It accommodates both skewness and zero inflation, aiming for better classification accuracy than linear discriminant analysis under a Gaussian assumption.
Contribution
The paper proposes a new sparse semiparametric discriminant analysis framework based on a truncated latent Gaussian copula model, addressing limitations of Gaussian assumptions in zero-inflated data.
Findings
Outperforms the existing method on simulated data
Successfully classifies Crohn's disease microbiome data
Identifies influential microbial genera
Abstract
Sequencing-based technologies provide an abundance of high-dimensional biological datasets with skewed and zero-inflated measurements. Classification of such data with linear discriminant analysis leads to poor performance due to the violation of the Gaussian distribution assumption. To address this limitation, we propose a new semiparametric discriminant analysis framework based on the truncated latent Gaussian copula model that accommodates both skewness and zero inflation. By applying sparsity regularization, we demonstrate that the proposed method leads to the consistent estimation of classification direction in high-dimensional settings. On simulated data, the proposed method shows superior performance compared to the existing method. We apply the method to discriminate healthy controls from patients with Crohn's disease based on microbiome data and to identify genera with the most…
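To make the data setting in the abstract concrete, the sketch below simulates two classes of skewed, zero-inflated measurements from a truncated latent Gaussian construction. It is an illustration under stated assumptions, not the authors' implementation: the exponential transform, the threshold at zero on the latent scale, the equicorrelated covariance, and the function name `simulate_truncated_copula` are all hypothetical choices made for this example. It does not implement the paper's estimator.

```python
import numpy as np

def simulate_truncated_copula(n, mean, cov, threshold, rng):
    """Draw n rows of zero-inflated data from a latent Gaussian.

    Latent Z ~ N(mean, cov). The observed value is exp(Z) when Z exceeds
    the threshold and 0 otherwise. exp() is an assumed monotone transform;
    any strictly increasing function would serve the same purpose here.
    """
    z = rng.multivariate_normal(mean, cov, size=n)
    return np.where(z > threshold, np.exp(z), 0.0)

rng = np.random.default_rng(0)
p = 5

# Assumed latent covariance: unit variances, pairwise correlation 0.5.
cov = 0.5 * np.eye(p) + 0.5 * np.ones((p, p))

# The classes differ only in the latent means of the first two features,
# which mimics a sparse classification direction.
mu0 = np.zeros(p)
mu1 = np.zeros(p)
mu1[:2] = 1.0

x0 = simulate_truncated_copula(200, mu0, cov, threshold=0.0, rng=rng)
x1 = simulate_truncated_copula(200, mu1, cov, threshold=0.0, rng=rng)

# Per-feature fraction of exact zeros in each class.
zero_frac0 = (x0 == 0).mean(axis=0)
zero_frac1 = (x1 == 0).mean(axis=0)
print("zero fraction, class 0:", np.round(zero_frac0, 2))
print("zero fraction, class 1:", np.round(zero_frac1, 2))
```

In this toy setup the class difference shows up both in the share of zeros and in the size of the nonzero values, and the nonzero values are right-skewed. This is the kind of structure that violates the Gaussian assumption behind standard linear discriminant analysis.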
Taxonomy
Topics: Gene expression and cancer classification · Cancer-related molecular mechanisms research · Statistical Methods and Inference
