Asymmetric integration of various cancer datasets for identifying risk-associated variants and genes
Ruixuan Wang, Lam Tran, Benjamin Brennan, Lars G Fritsche, Kevin He, J Chad Brenner, Hui Jiang

TL;DR
This paper adapts a recently developed asymmetric integration method to combine matched case–control cancer genotype datasets, improving the identification of variants and genes linked to cancer risk.
Contribution
The paper adapts and applies an asymmetric integration method that enhances statistical power by handling data heterogeneity and excluding unhelpful auxiliary datasets.
Findings
The integrated analysis identified more genetic variants and genes associated with cancer risks at the same false discovery rate.
The method successfully handles matched case–control study designs using conditional logistic regression models.
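As a rough illustration of the conditional logistic regression mentioned above, the following sketch fits a single log-odds coefficient for 1:1 matched case–control pairs. It is not the authors' implementation and does not include the asymmetric integration step. It only shows the standard conditional likelihood, in which each matched pair contributes exp(βx_case) / (exp(βx_case) + exp(βx_control)). The function names, the single-covariate setup, and the toy genotype values are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def conditional_loglik(beta, pairs):
    """Conditional log-likelihood for 1:1 matched pairs.

    pairs is a list of (x_case, x_control) covariate values, e.g. genotype
    dosages. For one pair, the conditional likelihood
    exp(b*x_case) / (exp(b*x_case) + exp(b*x_control))
    simplifies to sigmoid(b * (x_case - x_control)).
    """
    return sum(math.log(sigmoid(beta * (xc - xk))) for xc, xk in pairs)


def fit_beta(pairs, n_iter=50, tol=1e-10):
    """Maximize the conditional log-likelihood by Newton-Raphson (hypothetical helper)."""
    beta = 0.0
    for _ in range(n_iter):
        grad = 0.0
        hess = 0.0
        for xc, xk in pairs:
            d = xc - xk
            p = sigmoid(beta * d)
            grad += d * (1.0 - p)          # first derivative of the log-likelihood
            hess -= d * d * p * (1.0 - p)  # second derivative (non-positive)
        if hess == 0.0:
            # Every pair is concordant (d == 0): beta is not identifiable.
            break
        step = grad / hess
        beta -= step
        if abs(step) < tol:
            break
    return beta


# Toy data (made up): genotype dosage for the case and its matched control.
pairs = [(1, 0), (1, 0), (2, 1), (0, 1), (1, 1)]
beta_hat = fit_beta(pairs)
print("estimated log odds ratio:", beta_hat)
```

Pairs in which the case and control share the same covariate value contribute a constant to the likelihood, so only discordant pairs inform the estimate. Note that the Newton iteration diverges if all discordant pairs point in the same direction, since no finite maximum exists in that case.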
Abstract
Cancer genomic research provides an opportunity to identify cancer risk-associated genes, but often suffers from undesirably low statistical power due to limited sample size. Integrated analysis across different cancers has the potential to enhance statistical power for identifying pan-cancer risk genes. However, substantial heterogeneity across cancers makes this challenging. Recently, a novel asymmetric integration method was developed that can deal with data heterogeneity and exclude unhelpful datasets from the analysis. We adapted and applied this method to integrate genotype datasets with matched case and control individuals from the Michigan Genomics Initiative, using each cancer in turn as the primary dataset of interest and the other cancers as auxiliary datasets. Conditional logistic regression models were coupled with the asymmetric integrated framework to…
Taxonomy
Topics: Bioinformatics and Genomic Networks · Genetics, Bioinformatics, and Biomedical Research
