kruX: Matrix-based non-parametric eQTL discovery
Jianlong Qi, Hassan Foroughi Asl, Johan Bjorkegren, Tom Michoel

TL;DR
kruX is a matrix-based algorithm that significantly accelerates non-parametric eQTL discovery using the Kruskal-Wallis test, enabling robust genome-wide association analysis on large datasets without high-performance computing.
Contribution
We developed kruX, a fast, matrix-based implementation of the Kruskal-Wallis test for efficient, large-scale non-parametric eQTL mapping in genomics.
Findings
kruX is more than ten thousand times faster than computing associations one-by-one on a typical human dataset.
The Kruskal-Wallis test detects more non-linear associations.
It is more robust to outliers and heterogeneous data.
Abstract
The Kruskal-Wallis test is a popular non-parametric statistical test for identifying expression quantitative trait loci (eQTLs) from genome-wide data because it is robust to variations in the underlying genetic model and expression trait distribution. However, testing billions of marker-trait combinations one-by-one can become computationally prohibitive. We developed kruX, an algorithm implemented in Matlab, Python and R that uses matrix multiplications to calculate the Kruskal-Wallis test statistic for several millions of marker-trait combinations at once. kruX is more than ten thousand times faster than computing associations one-by-one on a typical human dataset. We used kruX and a dataset of more than 500k SNPs and 20k expression traits measured in 102 human blood samples to compare eQTLs detected by the Kruskal-Wallis test to eQTLs detected by the parametric ANOVA…
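To make the matrix idea in the abstract concrete, here is a minimal NumPy sketch of how the Kruskal-Wallis statistic could be computed for every marker-trait pair with a few matrix products. It is not the kruX code itself. The function names kruskal_wallis_matrix and kruskal_wallis_pair are hypothetical, and the sketch assumes genotypes coded as integers 0, 1, 2, no missing values, and no tied expression values (so no tie correction is applied).

```python
import numpy as np

def kruskal_wallis_matrix(expr, geno, n_levels=3):
    """Kruskal-Wallis H for all trait-marker pairs (illustrative sketch).

    expr: (traits, samples) expression values, assumed free of ties and NaNs.
    geno: (markers, samples) integer genotypes in 0..n_levels-1.
    Returns an array of shape (traits, markers).
    """
    expr = np.asarray(expr, dtype=float)
    geno = np.asarray(geno)
    n = expr.shape[1]
    # Rank each trait across samples (1..n); double argsort is valid without ties.
    ranks = expr.argsort(axis=1).argsort(axis=1) + 1.0
    total = np.zeros((expr.shape[0], geno.shape[0]))
    for g in range(n_levels):
        ind = (geno == g).astype(float)   # (markers, samples) group indicator
        sizes = ind.sum(axis=1)           # samples per group, one entry per marker
        rank_sums = ranks @ ind.T         # (traits, markers) rank sums in one product
        # Empty groups have rank sum 0, so dividing by 1 leaves their term at 0.
        total += rank_sums ** 2 / np.where(sizes > 0, sizes, 1.0)
    return 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)

def kruskal_wallis_pair(x, g):
    """One-by-one reference computation for a single trait-marker pair."""
    x, g = np.asarray(x, dtype=float), np.asarray(g)
    n = len(x)
    r = np.argsort(np.argsort(x)) + 1.0
    s = sum(r[g == k].sum() ** 2 / (g == k).sum() for k in np.unique(g))
    return 12.0 / (n * (n + 1)) * s - 3.0 * (n + 1)

# Small synthetic example: 5 traits, 7 markers, 40 samples.
rng = np.random.default_rng(0)
expr = rng.normal(size=(5, 40))
geno = rng.integers(0, 3, size=(7, 40))

H = kruskal_wallis_matrix(expr, geno)
H_ref = np.array([[kruskal_wallis_pair(x, g) for g in geno] for x in expr])
print(np.allclose(H, H_ref))
```

The loop runs over genotype levels only, so the work per level is a single matrix product regardless of how many markers and traits are tested. In a full analysis each statistic would then be compared against a chi-squared distribution with degrees of freedom equal to the number of non-empty genotype groups minus one.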
