Algebraic Comparison of Partial Lists in Bioinformatics
Giuseppe Jurman, Samantha Riccadonna, Roberto Visintainer, Cesare Furlanello

TL;DR
This paper introduces an algebraic method based on symmetric groups to compare and analyze the stability of partial, possibly unequal-length, feature lists in bioinformatics, aiding in understanding variability across different list outputs.
Contribution
The paper presents a novel algebraic approach for comparing partial lists of unequal lengths, extending stability analysis in bioinformatics feature selection.
Findings
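The method is demonstrated on synthetic data in a gene filtering task, where it quantifies the stability of the resulting lists.
It is then applied to a recent prostate cancer dataset to find gene profiles.
The algorithms handle lists of unequal length, either embedded in the full feature set or restricted to the features occurring in the partial lists.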
Abstract
The outcome of a functional genomics pipeline is usually a partial list of genomic features, ranked by their relevance in modelling biological phenotype in terms of a classification or regression model. Due to resampling protocols or just within a meta-analysis comparison, instead of one list it is often the case that sets of alternative feature lists (possibly of different lengths) are obtained. Here we introduce a method, based on the algebraic theory of symmetric groups, for studying the variability between lists ("list stability") in the case of lists of unequal length. We provide algorithms evaluating stability for lists embedded in the full feature set or just limited to the features occurring in the partial lists. The method is demonstrated first on synthetic data in a gene filtering task and then for finding gene profiles on a recent prostate cancer dataset.
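As a rough illustration of the two settings named in the abstract (lists embedded in the full feature set versus restricted to the features occurring in the partial lists), the sketch below scores a set of ranked partial lists by their mean pairwise distance. It is not the paper's algebraic construction: the Canberra-type rank distance, the choice of giving every missing feature the tie rank `len(list) + 1`, and the names `rank_map`, `canberra`, and `list_instability` are all assumptions made for this example.

```python
from itertools import combinations


def rank_map(partial_list, universe):
    """Assign a rank to every feature in `universe`.

    Features in the partial list get ranks 1..k in list order.
    ASSUMPTION: every feature absent from the list gets the tie rank k + 1.
    """
    ranks = {feature: i + 1 for i, feature in enumerate(partial_list)}
    missing_rank = len(partial_list) + 1
    return {feature: ranks.get(feature, missing_rank) for feature in universe}


def canberra(r1, r2):
    """Canberra-type distance between two rank maps over the same features.

    ASSUMPTION: this distance is a stand-in chosen for illustration.
    """
    return sum(abs(r1[f] - r2[f]) / (r1[f] + r2[f]) for f in r1)


def list_instability(lists, full_feature_set=None):
    """Mean pairwise distance over a set of ranked partial lists.

    If `full_feature_set` is given, lists are embedded in it; otherwise
    the comparison is limited to features occurring in at least one list.
    Lower values mean more stable lists.
    """
    if full_feature_set is None:
        universe = set().union(*lists)
    else:
        universe = set(full_feature_set)
    maps = [rank_map(lst, universe) for lst in lists]
    pairs = list(combinations(maps, 2))
    return sum(canberra(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Hypothetical gene identifiers, lists of unequal length.
    lists = [["g1", "g2", "g3"], ["g1", "g3"], ["g2", "g1", "g4", "g3"]]
    print(list_instability(lists))
    print(list_instability(lists, full_feature_set=["g1", "g2", "g3", "g4", "g5"]))
```

Under this simplification, identical lists score 0 and the score grows as rankings diverge. Features missing from two lists of different lengths receive different tie ranks and so contribute to the distance, which is one reason the embedded and restricted variants can disagree.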
