SparCA: Sparse Compressed Agglomeration for Feature Extraction and Dimensionality Reduction
Leland Barnard, Farwa Ali, Hugo Botha, David T. Jones

TL;DR
SparCA is a new hierarchical feature grouping and compression method that produces interpretable features across diverse data types, performing well on supervised tasks without hyperparameter tuning.
Contribution
Introduces SparCA, a novel hyperparameter-free dimensionality reduction technique that generalizes across data types and enhances interpretability and performance.
Findings
Effective on synthetic and real-world datasets
Produces highly interpretable features
Achieves strong supervised learning performance
Abstract
The most effective dimensionality reduction procedures produce interpretable features from the raw input space while also providing good performance for downstream supervised learning tasks. For many methods, this requires optimizing one or more hyperparameters for a specific task, which can limit generalizability. In this study, we propose sparse compressed agglomeration (SparCA), a novel dimensionality reduction procedure that involves a multistep hierarchical feature grouping, compression, and feature selection process. We demonstrate the characteristics and performance of the SparCA method across heterogeneous synthetic and real-world datasets, including images, natural language, and single-cell gene expression data. Our results show that SparCA is applicable to a wide range of data types, produces highly interpretable features, and shows compelling performance on downstream…
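The abstract names three stages (feature grouping, compression, feature selection) but does not give their details, so the following is only a generic illustration of that kind of pipeline, not the SparCA algorithm itself. Every function name and parameter here is hypothetical. The choices of absolute correlation as the similarity measure, the first principal component as the compression step, and variance ranking as the selection step are all assumptions. Note also that this sketch takes `n_groups` and `n_keep` as inputs, whereas the paper describes SparCA as not requiring hyperparameter tuning.

```python
import numpy as np

def group_features(X, n_groups):
    """Greedily merge columns by average absolute correlation (assumed similarity)."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    groups = [[j] for j in range(X.shape[1])]
    while len(groups) > n_groups:
        best, pair = -1.0, None
        # Find the two groups with the highest mean cross-correlation.
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                sim = corr[np.ix_(groups[a], groups[b])].mean()
                if sim > best:
                    best, pair = sim, (a, b)
        a, b = pair
        groups[a] = groups[a] + groups[b]
        del groups[b]
    return groups

def compress_group(X, cols):
    """Compress one group of columns to its first principal component scores."""
    Z = X[:, cols] - X[:, cols].mean(axis=0)
    U, S, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, 0] * S[0]

def grouped_compression(X, n_groups, n_keep):
    """Group columns, compress each group, keep the n_keep highest-variance components."""
    groups = group_features(X, n_groups)
    comps = np.column_stack([compress_group(X, g) for g in groups])
    order = np.argsort(comps.var(axis=0))[::-1][:n_keep]
    return comps[:, order], [groups[i] for i in order]
```

Each returned component maps back to an explicit list of input columns, which is the sense in which grouped features of this kind are easy to interpret.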
Taxonomy
Topics: Domain Adaptation and Few-Shot Learning · Single-cell and spatial transcriptomics · COVID-19 diagnosis using AI
Methods: Principal Components Analysis · Feature Selection
