Correlating Cellular Features with Gene Expression using CCA
Vaishnavi Subramanian, Benjamin Chidester, Jian Ma, Minh N. Do

TL;DR
This paper demonstrates the use of canonical correlation analysis to identify meaningful associations between histopathological image features and gene expression in breast cancer, aiding joint phenotype-genotype analysis.
Contribution
It applies CCA and Sparse CCA to integrate histopathological image features with gene expression in breast cancer, treating both data types equally and without relying on prior biological knowledge.
Findings
Significant correlation between image features and PAM50 gene expression.
Sparse CCA uncovers pathway enrichments linked to cancer.
Validated utility of CCA for joint phenotype-genotype analysis.
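To make the Sparse CCA finding more concrete, here is a minimal sketch of one common sparse formulation: a rank-1 penalized decomposition of the cross-covariance matrix with soft-thresholding. It is an illustration under stated assumptions, not necessarily the variant used in the paper. The data are synthetic, and the function names and penalty values (`lam_u`, `lam_v`) are hypothetical.

```python
import numpy as np

def soft_threshold(a, lam):
    """Shrink each entry of a toward zero by lam; entries below lam become 0."""
    return np.sign(a) * np.maximum(np.abs(a) - lam, 0.0)

def sparse_cca_rank1(X, Y, lam_u=0.4, lam_v=0.4, n_iter=100):
    """Rank-1 sparse CCA sketch (assumes within-set covariances ~ identity)."""
    # Standardise columns so the cross-covariance holds correlations.
    X = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    Y = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
    C = X.T @ Y / (X.shape[0] - 1)
    # Initialise v with the leading right singular vector of C.
    v = np.linalg.svd(C, full_matrices=False)[2][0]
    u = np.zeros(C.shape[0])
    for _ in range(n_iter):
        u_new = soft_threshold(C @ v, lam_u)
        if np.linalg.norm(u_new) == 0:
            break  # penalty too strong: every weight was zeroed
        u = u_new / np.linalg.norm(u_new)
        v_new = soft_threshold(C.T @ u, lam_v)
        if np.linalg.norm(v_new) == 0:
            break
        v = v_new / np.linalg.norm(v_new)
    return u, v

# Synthetic stand-in data: only a few columns in each view share a signal.
rng = np.random.default_rng(0)
n = 615
z = rng.standard_normal(n)
X = rng.standard_normal((n, 20))   # stand-in for image features
Y = rng.standard_normal((n, 30))   # stand-in for gene expression
X[:, :3] = z[:, None] + 0.5 * rng.standard_normal((n, 3))
Y[:, :2] = z[:, None] + 0.5 * rng.standard_normal((n, 2))

u, v = sparse_cca_rank1(X, Y)
print("nonzero image-feature weights:", np.flatnonzero(u).tolist())
print("nonzero gene weights:", np.flatnonzero(v).tolist())
```

The nonzero weights pick out a small subset of variables in each view, which is what makes downstream steps such as pathway enrichment on the selected genes feasible.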
Abstract
To understand the biology of cancer, joint analysis of multiple data modalities, including imaging and genomics, is crucial. The involved nature of gene-microenvironment interactions necessitates the use of algorithms which treat both data types equally. We propose the use of canonical correlation analysis (CCA) and a sparse variant as a preliminary discovery tool for identifying connections across modalities, specifically between gene expression and features describing cell and nucleus shape, texture, and stain intensity in histopathological images. Applied to 615 breast cancer samples from The Cancer Genome Atlas, CCA revealed significant correlation of several image features with expression of PAM50 genes, known to be linked to outcome, while Sparse CCA revealed associations with enrichment of pathways implicated in cancer without leveraging prior biological understanding. These…
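As an illustration of what CCA computes, the sketch below finds paired weight vectors for two views that maximise the correlation between their projections. It is a minimal NumPy version run on synthetic data, not the authors' pipeline: the ridge term `reg`, the function name `cca`, and the simulated "image" and "gene" matrices are all assumptions made for the example.

```python
import numpy as np

def cca(X, Y, n_components=2, reg=1e-3):
    """Ridge-regularised CCA via whitening and SVD.

    X is (n, p), Y is (n, q). Returns canonical correlations and the
    weight matrices Wx (p, k) and Wy (q, k).
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)
    # Cholesky factors: Sxx = Lx Lx^T, Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # Whitened cross-covariance M = Lx^{-1} Sxy Ly^{-T}.
    M = np.linalg.solve(Lx, Sxy)
    M = np.linalg.solve(Ly, M.T).T
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Map singular vectors back to the original feature spaces.
    Wx = np.linalg.solve(Lx.T, U[:, :n_components])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :n_components])
    return s[:n_components], Wx, Wy

# Synthetic stand-in data: one shared latent factor drives both views.
rng = np.random.default_rng(0)
n = 615
z = rng.standard_normal(n)
X = z[:, None] + 0.5 * rng.standard_normal((n, 5))   # stand-in image features
Y = z[:, None] + 0.5 * rng.standard_normal((n, 8))   # stand-in gene expression

corrs, Wx, Wy = cca(X, Y)
x_scores = (X - X.mean(axis=0)) @ Wx[:, 0]
y_scores = (Y - Y.mean(axis=0)) @ Wy[:, 0]
score_corr = np.corrcoef(x_scores, y_scores)[0, 1]
print("canonical correlations:", corrs)
print("correlation of first score pair:", score_corr)
```

Because both views enter the objective symmetrically, neither is treated as the predictor or the response. In this simulation only the first canonical pair reflects shared structure; later pairs capture sampling noise.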
Taxonomy
Topics: Cell Image Analysis Techniques · AI in cancer detection · Gene expression and cancer classification
