Ultrahigh Dimensional Feature Selection via Kernel Canonical Correlation Analysis
Tianqi Liu, Kuang-Yao Lee, and Hongyu Zhao

TL;DR
This paper introduces KCCA-SIS, a novel kernel canonical correlation analysis-based feature screening method for ultrahigh dimensional data, capable of handling nonlinear dependencies without model assumptions, and demonstrates its superior performance in simulations and real gene expression data.
Contribution
The paper develops KCCA-SIS, a new nonlinear, scale-free feature screening method that generalizes SIS and DC-SIS, and demonstrates its effectiveness on high-dimensional biological data.
Findings
KCCA-SIS has the sure screening property.
KCCA-SIS outperforms existing methods in simulations.
KCCA-SIS identifies relevant genes in a brain development study.
Abstract
High-dimensional variable selection is an important issue in many scientific fields, such as genomics. In this paper, we develop a sure independence feature screening procedure based on kernel canonical correlation analysis (KCCA-SIS, for short). KCCA-SIS is easy to implement and apply. Compared to the sure independence screening procedure based on the Pearson correlation (SIS, for short) developed by Fan and Lv [2008], KCCA-SIS can handle nonlinear dependencies among variables. Compared to the sure independence screening procedure based on the distance correlation (DC-SIS, for short) proposed by Li et al. [2012], KCCA-SIS is scale free, distribution free, and has better approximation results based on the universal characteristic of the Gaussian kernel (Micchelli et al. [2006]). KCCA-SIS is more general than SIS and DC-SIS in the sense that SIS and DC-SIS correspond to certain…
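The screening idea in the abstract can be illustrated with a minimal sketch: score each feature by a regularized kernel canonical correlation with the response, then keep the top-ranked features. This is not the authors' implementation. The function names (`centered_gaussian_gram`, `kcca_stat`, `kcca_screen`), the median-heuristic bandwidth, the regularization constant `eps`, and the number of retained features `d` are all assumptions made for illustration.

```python
import numpy as np


def centered_gaussian_gram(v, bandwidth=None):
    """Centered Gaussian-kernel Gram matrix for a 1-D or 2-D sample."""
    v = np.asarray(v, dtype=float)
    v = v.reshape(v.shape[0], -1)
    # Pairwise squared Euclidean distances.
    sq = np.sum((v[:, None, :] - v[None, :, :]) ** 2, axis=-1)
    if bandwidth is None:
        # Assumption: median heuristic, not necessarily the paper's choice.
        positive = sq[sq > 0]
        bandwidth = np.sqrt(np.median(positive)) if positive.size else 1.0
    K = np.exp(-sq / (2.0 * bandwidth ** 2))
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return H @ K @ H


def kcca_stat(x, y, eps=0.02):
    """First regularized kernel canonical correlation between x and y.

    Computed as the largest singular value of
    (Kx + n*eps*I)^{-1} Kx Ky (Ky + n*eps*I)^{-1}.
    """
    n = len(x)
    Kx = centered_gaussian_gram(x)
    Ky = centered_gaussian_gram(y)
    ridge = n * eps * np.eye(n)
    Rx = np.linalg.solve(Kx + ridge, Kx)  # (Kx + cI)^{-1} Kx
    Ry = np.linalg.solve(Ky + ridge, Ky)  # equals Ky (Ky + cI)^{-1}
    return np.linalg.norm(Rx @ Ry, 2)     # spectral norm


def kcca_screen(X, y, d, eps=0.02):
    """Rank the columns of X by kcca_stat and return the top-d indices."""
    X = np.asarray(X, dtype=float)
    scores = np.array([kcca_stat(X[:, j], y, eps) for j in range(X.shape[1])])
    keep = np.argsort(scores)[::-1][:d]
    return keep, scores
```

Calling `kcca_screen(X, y, d)` on an `n × p` matrix `X` returns the indices of the `d` highest-scoring features along with all `p` scores. Each score requires solving two `n × n` linear systems, so for ultrahigh `p` a practical version would compute the response Gram matrix once and reuse it across features.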
Taxonomy
Topics: Gene expression and cancer classification · Spectroscopy and Chemometric Analyses · Face and Expression Recognition
