A novel, computationally tractable algorithm flags in big matrices every column associated in any way with others or a dependent variable, with much higher power when columns are linked like mutations in chromosomes
Marcos A. Antezana, Carlos A. Machado

TL;DR
This paper introduces PAS and dvPAS, computationally efficient algorithms for detecting associations in large data matrices, especially effective for linked markers like chromosome mutations, with high power and controlled false positives.
Contribution
The paper presents PAS and dvPAS, novel algorithms that efficiently identify associated columns in large matrices without exhaustive search, outperforming traditional methods in speed and effectiveness.
Findings
PAS and dvPAS maintain uniform type I error rates.
They effectively detect complex multi-column and multi-variable associations.
Power increases significantly when columns carry background associations, as linked mutations on chromosomes do.
Abstract
Exhaustively scanning a big data matrix (DM) for subsets of independent variables (IVs) that are associated with a dependent variable (DV) is computationally tractable only for 1- and 2-IV effects. I present a highly computationally tractable Participation-In-Association Score (PAS) that, in a DM with markers, flags every column that is strongly associated with others. PAS examines no column subsets, and its computational cost grows linearly with the number of DM columns, remaining reasonable even in million-column DMs. PAS exploits how associations of markers in DM rows cause associations of matches in the rows' pairwise comparisons. For every such comparison with a match at a tested column, PAS computes the matches at the other columns by modifying the comparison's total matches (scored once per DM), yielding a distribution of conditional matches that is perturbed by associations of the tested column. Equally tractable is…
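The bookkeeping behind the conditional matches described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' PAS statistic or implementation: the function names `pair_totals` and `conditional_matches`, the use of NumPy, and the toy binary matrix are all assumptions made for illustration. The sketch only shows that total matches are scored once per DM, and that for a tested column the matches at the other columns follow by subtracting the tested column's own match.

```python
import numpy as np

def pair_totals(dm):
    """Score, once per DM, the total number of matching columns
    for every pairwise comparison of rows (i < j)."""
    i, j = np.triu_indices(dm.shape[0], k=1)
    totals = (dm[i] == dm[j]).sum(axis=1)
    return i, j, totals

def conditional_matches(dm, i, j, totals, col):
    """For comparisons that match at the tested column `col`, return the
    number of matches at the other columns (total minus the tested one)."""
    hit = dm[i, col] == dm[j, col]
    return totals[hit] - 1

# Hypothetical toy data: 200 rows, 50 binary markers, column 1 copies column 0.
rng = np.random.default_rng(0)
dm = rng.integers(0, 2, size=(200, 50))
dm[:, 1] = dm[:, 0]

i, j, totals = pair_totals(dm)            # computed once for the whole DM
linked = conditional_matches(dm, i, j, totals, col=0)
unlinked = conditional_matches(dm, i, j, totals, col=10)
print("mean conditional matches, linked column:  ", linked.mean())
print("mean conditional matches, unlinked column:", unlinked.mean())
```

In this toy setup one would expect the column that has a copy to show a slightly shifted distribution of conditional matches compared with an independent column. How PAS turns such a distribution into a score and a significance call is not specified in the text above, so the sketch does not attempt it.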
Taxonomy
Topics: Gene expression and cancer classification
