pyBioSig: optimizing group discrimination using genetic algorithms for biosignature discovery
Frederico G. C. Arnoldi, Rodrigo F. Rodrigues, Celio L. Silva

TL;DR
This paper introduces pyBioSig, a Python and R tool that uses genetic algorithms to identify small, effective biosignatures for distinguishing multiple biological groups, improving robustness over single-molecule biomarkers.
Contribution
The work presents a novel multivariate filter using genetic algorithms for biosignature discovery, implemented in pyBioSig, enabling efficient identification of gene sets that discriminate multiple groups.
Findings
Successfully identified small gene biosignatures that recover multiple groups.
Demonstrated the method's effectiveness in hierarchical clustering.
Achieved feasible computation times on personal computers.
Abstract
In medical sciences, a biomarker is "a characteristic that is objectively measured and evaluated as an indicator of normal biological processes, pathogenic processes, or pharmacologic responses to a therapeutic intervention". Molecular experiments are providing rapid and systematic approaches to search for biomarkers, but because single-molecule biomarkers have shown a disappointing lack of robustness for clinical diagnosis, researchers have begun searching for distinctive sets of molecules, called "biosignatures". However, the most popular statistics are not appropriate for their identification, and the number of possible biosignatures to be tested is frequently intractable. In the present work, we developed a "multivariate filter" using genetic algorithms (GA) as a feature (gene) selector to optimize a measure of intra-group cohesion and inter-group dispersion. This method was…
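The abstract's idea, a genetic algorithm that searches gene subsets to optimize intra-group cohesion against inter-group dispersion, can be illustrated with a minimal sketch. This is not pyBioSig's code or API. The function names (`separation_score`, `select_signature`, `make_toy_data`), the fitness formula (mean distance between group centroids divided by mean distance of samples to their own centroid), and all GA settings are assumptions chosen for illustration; the measure the paper actually optimizes may differ.

```python
import random
from statistics import mean


def separation_score(data, labels, genes):
    """Hypothetical fitness (an assumption, not the paper's measure):
    mean centroid-to-centroid distance / mean sample-to-own-centroid distance."""
    groups = {}
    for row, lab in zip(data, labels):
        groups.setdefault(lab, []).append([row[g] for g in genes])
    centroids = {lab: [mean(col) for col in zip(*rows)]
                 for lab, rows in groups.items()}

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    within = mean(dist(r, centroids[lab])
                  for lab, rows in groups.items() for r in rows)
    labs = sorted(centroids)
    between = mean(dist(centroids[a], centroids[b])
                   for i, a in enumerate(labs) for b in labs[i + 1:])
    return between / (within + 1e-9)


def select_signature(data, labels, k=3, pop_size=30, generations=40,
                     mutation_rate=0.2, seed=0):
    """Toy GA over fixed-size gene subsets; every setting is illustrative."""
    rng = random.Random(seed)
    n_genes = len(data[0])
    cache = {}

    def fitness(genes):
        # Memoize scores, since the same subset is evaluated many times.
        if genes not in cache:
            cache[genes] = separation_score(data, labels, genes)
        return cache[genes]

    population = [tuple(sorted(rng.sample(range(n_genes), k)))
                  for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        next_gen = ranked[:2]  # elitism: carry over the two best subsets
        while len(next_gen) < pop_size:
            # Tournament selection: best of three random individuals, twice.
            p1 = max(rng.sample(population, 3), key=fitness)
            p2 = max(rng.sample(population, 3), key=fitness)
            # Crossover: draw k genes from the union of both parents.
            child = rng.sample(sorted(set(p1) | set(p2)), k)
            # Mutation: replace one gene with a gene not yet in the child.
            candidates = [g for g in range(n_genes) if g not in child]
            if candidates and rng.random() < mutation_rate:
                child[rng.randrange(k)] = rng.choice(candidates)
            next_gen.append(tuple(sorted(child)))
        population = next_gen
    return max(population, key=fitness)


def make_toy_data(n_per_group=10, n_genes=20, informative=(2, 7, 11), seed=1):
    """Synthetic expression matrix: three groups, a few shifted genes."""
    rng = random.Random(seed)
    data, labels = [], []
    for group in range(3):
        for _ in range(n_per_group):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            for j, g in enumerate(informative):
                # Give each informative gene a group-specific offset.
                row[g] += 5.0 * ((group + j) % 3)
            data.append(row)
            labels.append(group)
    return data, labels
```

Calling `select_signature(*make_toy_data(), k=3)` returns a sorted tuple of three gene indices. Fixing the signature size `k` keeps the search space small; a fuller implementation would likely also vary the subset size and penalize larger signatures.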
Taxonomy
Topics: Gene expression and cancer classification · Single-cell and spatial transcriptomics
