TL;DR
This paper introduces a scalable supervised dimensionality reduction method, LOL, that improves classification accuracy on ultra-high-dimensional biomedical data while maintaining computational efficiency.
Contribution
The paper presents XOX, a novel framework extending PCA with class-conditional moments, and introduces LOL, a simple yet effective supervised reduction technique with theoretical guarantees.
Findings
LOL outperforms existing methods in accuracy on large biomedical datasets
LOL scales efficiently to millions of features within minutes
Theoretical analysis supports the effectiveness of the proposed approach
Abstract
To solve key biomedical problems, experimentalists now routinely measure millions or billions of features (dimensions) per sample, with the hope that data science techniques will be able to build accurate data-driven inferences. Because sample sizes are typically orders of magnitude smaller than the dimensionality of these data, valid inferences require finding a low-dimensional representation that preserves the discriminating information (e.g., whether the individual suffers from a particular disease). There is a lack of interpretable supervised dimensionality reduction methods that scale to millions of dimensions with strong statistical theoretical guarantees. We introduce an approach, XOX, to extending principal components analysis by incorporating class-conditional moment estimates into the low-dimensional projection. The simplest version, "Linear Optimal Low-rank" projection (LOL),…
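The idea of adding class-conditional moment estimates to a PCA-style projection can be illustrated with a minimal sketch. It assumes two classes, takes the class-conditional moment to be the difference of class means, and combines it with the leading principal directions of the class-centered data. The truncated abstract above does not spell out these details, so treat them as assumptions. The function name `lol_projection_sketch` is hypothetical and is not the authors' API.

```python
import numpy as np


def lol_projection_sketch(X, y, d):
    """Illustrative supervised projection (hypothetical helper, two classes only).

    Assumption: rows of the projection span the class-mean difference plus
    the top (d - 1) principal directions of the class-centered data.
    Returns an array of shape (d, n_features) with orthonormal rows.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    classes = np.unique(y)
    if classes.size != 2:
        raise ValueError("this sketch handles exactly two classes")
    if not 1 <= d <= min(X.shape):
        raise ValueError("d must be between 1 and min(n_samples, n_features)")

    # First-moment estimates: one mean vector per class.
    means = np.stack([X[y == c].mean(axis=0) for c in classes])
    delta = means[1] - means[0]

    # Subtract each sample's own class mean (class-conditional centering).
    class_index = np.searchsorted(classes, y)
    X_centered = X - means[class_index]

    # Principal directions of the centered data via a thin SVD.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)

    # Stack the mean difference on top of the leading directions,
    # then orthonormalize with a reduced QR factorization.
    A = np.vstack([delta, Vt[: d - 1]])
    Q, _ = np.linalg.qr(A.T)
    return Q.T


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, p = 40, 500                      # far fewer samples than features
    labels = np.repeat([0, 1], n // 2)
    data = rng.normal(size=(n, p))
    data[labels == 1, 0] += 3.0         # class signal in a single feature
    P = lol_projection_sketch(data, labels, d=3)
    embedded = data @ P.T               # low-dimensional representation
    print(P.shape, embedded.shape)
```

Because the thin SVD works on an n-by-p matrix with n much smaller than p, the cost of this sketch grows roughly linearly in the number of features, which is the regime the abstract targets. A downstream classifier would then be trained on `embedded` rather than on the raw features.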
Taxonomy
Methods: Principal Components Analysis
