Towards multiple kernel principal component analysis for integrative analysis of tumor samples
Nora K. Speicher, Nico Pfeifer

TL;DR
This paper introduces an unsupervised data integration method based on kernel PCA that combines multiple data sources for cancer subtype analysis, supporting visualization of the integrated data and subsequent clustering without requiring parameter tuning.
Contribution
It proposes a scoring function that determines the impact of each input matrix in multiple kernel PCA, so that the integration draws on all data sources instead of collapsing onto a single one, as the straightforward multiple-kernel extension does.
Findings
Effective integration of multiple cancer data types
Enhanced visualization of combined data
Improved clustering results for cancer subtypes
Abstract
Personalized treatment of patients based on tissue-specific cancer subtypes has strongly increased the efficacy of the chosen therapies. Even though the amount of data measured for cancer patients has increased over the last years, most cancer subtypes are still diagnosed based on individual data sources (e.g. gene expression data). We propose an unsupervised data integration method based on kernel principal component analysis. Principal component analysis is one of the most widely used techniques in data analysis. Unfortunately, the straightforward multiple-kernel extension of this method leads to the use of only one of the input matrices, which does not fit the goal of gaining information from all data sources. Therefore, we present a scoring function to determine the impact of each input matrix. The approach enables visualizing the integrated data and subsequent clustering for…
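Illustration
The abstract's remark that the straightforward multiple-kernel extension ends up using only one input matrix can be illustrated numerically. The sketch below is not the authors' scoring function. It assumes two synthetic data sources, RBF kernels with arbitrarily chosen widths, and a convex combination b*K1 + (1-b)*K2 as the combined kernel. All names (rbf_kernel, center_kernel, top_variance, kernel_pca_projection) are hypothetical helpers written for this example. Because the sum of the p largest eigenvalues is a convex function of the kernel matrix, maximizing the retained variance over b pushes the weight to 0 or 1, i.e. onto a single data source.

```python
import numpy as np

def rbf_kernel(X, gamma):
    """Gaussian (RBF) kernel matrix for the rows of X."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))

def center_kernel(K):
    """Center a kernel matrix in feature space: H K H with H = I - 11^T/n."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    return H @ K @ H

def top_variance(K, p):
    """Variance retained by kernel PCA: sum of the p largest eigenvalues."""
    w = np.linalg.eigvalsh(K)  # eigenvalues in ascending order
    return float(np.sum(w[-p:]))

def kernel_pca_projection(K, p):
    """Project the samples onto the first p kernel principal components."""
    w, V = np.linalg.eigh(K)
    w, V = w[::-1][:p], V[:, ::-1][:, :p]
    return V * np.sqrt(np.maximum(w, 0.0))

# Two synthetic "data sources" for the same 40 samples (assumed data).
rng = np.random.default_rng(0)
n, p = 40, 2
X1 = rng.normal(size=(n, 5))
X2 = rng.normal(size=(n, 8))
K1 = center_kernel(rbf_kernel(X1, gamma=0.1))
K2 = center_kernel(rbf_kernel(X2, gamma=0.05))

# Scan the weight b of the convex combination b*K1 + (1-b)*K2.
betas = np.linspace(0.0, 1.0, 11)
variances = [top_variance(b * K1 + (1.0 - b) * K2, p) for b in betas]
best = int(np.argmax(variances))

# 2-D embedding for an equal-weight combination, e.g. for plotting.
embedding = kernel_pca_projection(0.5 * K1 + 0.5 * K2, p)
```

In this sketch the index best falls on one end of the grid (b = 0 or b = 1), which is the degenerate behavior the abstract describes and the reason a separate criterion for weighting the input matrices is needed.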
