Fair molecular feature selection unveils universally tumor lineage-informative methylation sites in colorectal cancer
Xuan Cindy Li, Yuelin Liu, Alejandro A Schäffer, Stephen M Mount, S Cenk Sahinalp

TL;DR
This paper introduces FALAFL, a new method for fairly selecting molecular features in cancer sequencing data, revealing methylation sites informative across diverse colorectal cancer patients.
Contribution
FALAFL is a novel combinatorial optimization algorithm for fair feature selection in multi-sample sequencing data.
Findings
FALAFL identifies CpG sites well covered across most patients and with high read coverage per patient.
Selected sites show strong tumor lineage-informativeness across diverse patient profiles.
Universally informative sites are enriched in inter-CpG island regions.
Abstract
In the era of precision medicine, performing comparative analysis over diverse patient populations is a fundamental step toward tailoring healthcare interventions. However, the aspect of fairly selecting molecular features across multiple patients is often overlooked. To address this challenge, we introduce FALAFL (FAir muLti-sAmple Feature seLection), an algorithmic approach based on combinatorial optimization. FALAFL is designed to perform feature selection in sequencing data in a way that ensures a balanced selection of features from all patient samples in a cohort. We have applied FALAFL to the problem of selecting lineage-informative CpG sites within a cohort of colorectal cancer patients subjected to low-coverage single-cell methylation sequencing. Our results demonstrate that FALAFL can rapidly and robustly determine the optimal set of CpG sites, which are each well covered by cells…
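To make the idea of "fair" site selection concrete, the following is a minimal sketch in Python. It is not the FALAFL algorithm: FALAFL is described as a combinatorial optimization, whereas this sketch uses a plain threshold filter as a stand-in. The function name `select_fair_sites`, the parameters `min_cell_frac` and `min_patient_frac`, and the input layout (one boolean cells-by-sites coverage matrix per patient) are all assumptions made for illustration.

```python
import numpy as np

def select_fair_sites(coverage_by_patient, min_cell_frac=0.5, min_patient_frac=0.8):
    """Illustrative threshold filter (hypothetical, NOT the FALAFL formulation).

    coverage_by_patient maps a patient ID to a boolean array of shape
    (cells, sites); True means the cell has at least one read at that CpG site.
    All patients are assumed to share the same site ordering.
    """
    patients = sorted(coverage_by_patient)
    # Fraction of cells covering each site, per patient: shape (patients, sites).
    cell_frac = np.stack(
        [np.asarray(coverage_by_patient[p], dtype=bool).mean(axis=0) for p in patients]
    )
    # A site counts as well covered in a patient if enough of that patient's cells cover it.
    well_covered = cell_frac >= min_cell_frac
    # Keep sites that are well covered in a large enough share of patients.
    patient_frac = well_covered.mean(axis=0)
    return np.flatnonzero(patient_frac >= min_patient_frac)

# Toy cohort: two patients, two cells each, three CpG sites (made-up data).
toy = {
    "patient_A": np.array([[1, 1, 0], [1, 0, 0]], dtype=bool),
    "patient_B": np.array([[1, 0, 1], [1, 0, 1]], dtype=bool),
}
print(select_fair_sites(toy, min_cell_frac=0.5, min_patient_frac=1.0))
```

In the toy cohort only site 0 is covered by at least half of the cells in every patient, so requiring all patients keeps that site alone. Relaxing `min_patient_frac` admits sites that are well covered in only some patients, which is the kind of imbalance a fairness constraint is meant to limit.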
Taxonomy
Topics: Epigenetics and DNA Methylation · Ferroptosis and cancer prognosis · Cancer Genomics and Diagnostics
