Efficient Canonical Correlation Analysis with Sparsity
Zixuan Wu, Elena Tuzhilina, Claire Donnat

TL;DR
This paper presents ECCAR, a fast, scalable, and provably consistent sparse CCA algorithm that effectively balances computational efficiency and statistical accuracy in high-dimensional data analysis.
Contribution
Introduces ECCAR, a novel sparse CCA method formulated as a reduced-rank regression, eliminating the need for complex projections and enabling practical large-scale applications.
Findings
ECCAR is projection-free and significantly faster than competing sparse CCA methods.
It uncovers reliable, interpretable associations in two biological datasets and in an ML interpretability task.
The method is validated through extensive simulations and real-data applications.
Abstract
In high-dimensional settings, Canonical Correlation Analysis (CCA) often fails, and existing sparse methods force an untenable choice between computational speed and statistical rigor. This work introduces a fast and provably consistent sparse CCA algorithm (ECCAR) that resolves this trade-off. We formulate CCA as a high-dimensional reduced-rank regression problem, which allows us to derive consistent estimators with high-probability error bounds without relying on computationally expensive techniques like Fantope projections. The resulting algorithm is scalable, projection-free, and significantly faster than its competitors. We validate our method through extensive simulations and demonstrate its power to uncover reliable and interpretable associations in two complex biological datasets, as well as in an ML interpretability task. Our work makes sparse CCA a practical and trustworthy…
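The regression view of CCA mentioned in the abstract can be illustrated in the classical, low-dimensional case. The sketch below is not the ECCAR algorithm: it has no sparsity penalty and assumes invertible sample covariances. It only checks the well-known fact that regressing whitened Y on X and taking an SVD of the fitted values recovers the canonical correlations and directions. The helper names (`inv_sqrt`, `cca_direct`, `cca_via_regression`) and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def inv_sqrt(S):
    """Inverse symmetric square root of a positive-definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V / np.sqrt(w)) @ V.T

def cca_direct(X, Y):
    """Canonical correlations from the SVD of Sxx^{-1/2} Sxy Syy^{-1/2}."""
    n = X.shape[0]
    Sxx, Syy, Sxy = X.T @ X / n, Y.T @ Y / n, X.T @ Y / n
    M = inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy)
    return np.linalg.svd(M, compute_uv=False)

def cca_via_regression(X, Y, rank):
    """Same quantities via least squares on whitened Y, then a rank-r SVD."""
    n = X.shape[0]
    Syy_inv_sqrt = inv_sqrt(Y.T @ Y / n)
    Yw = Y @ Syy_inv_sqrt                       # whitened responses
    B, *_ = np.linalg.lstsq(X, Yw, rcond=None)  # unpenalized regression coefficients
    # SVD of the (scaled) fitted values: singular values are the canonical correlations
    _, s, Vt = np.linalg.svd(X @ B / np.sqrt(n), full_matrices=False)
    V0 = Vt[:rank].T
    U = B @ V0 / s[:rank]                       # X-side directions, U' Sxx U = I
    V = Syy_inv_sqrt @ V0                       # Y-side directions, V' Syy V = I
    return s[:rank], U, V

# Synthetic data with two shared latent factors (illustrative only)
rng = np.random.default_rng(0)
n, p, q = 500, 6, 4
Z = rng.normal(size=(n, 2))
X = Z @ rng.normal(size=(2, p)) + rng.normal(size=(n, p))
Y = Z @ rng.normal(size=(2, q)) + rng.normal(size=(n, q))
X -= X.mean(axis=0)
Y -= Y.mean(axis=0)

rho_direct = cca_direct(X, Y)
rho_rr, U, V = cca_via_regression(X, Y, rank=2)
print(rho_direct[:2], rho_rr)
```

In a high-dimensional setting the least-squares step is where a sparsity-inducing penalty would enter; how ECCAR does this is not specified in the text above, so it is left out here.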
Taxonomy
Topics: Face and Expression Recognition · Bayesian Methods and Mixture Models · Advanced Clustering Algorithms Research
