FlexLMM: a Nextflow linear mixed model framework for GWAS
Saul Pierotti, Tomas Fitzgerald, Ewan Birney

TL;DR
FlexLMM is a flexible Nextflow pipeline designed for genome-wide association studies using linear mixed models, enabling permutation-based significance testing while accounting for population structure.
Contribution
FlexLMM introduces a customizable pipeline that fits linear mixed models and sets significance thresholds by permutation in a way that remains appropriate when population structure is present in GWAS.
Findings
Allows permutation-based significance thresholds in GWAS with population structure
Flexible model specification for linear mixed models
Suitable for multi-parental crosses in model organisms and agriculture
Abstract
Summary: Linear mixed models are a commonly used statistical approach in genome-wide association studies when population structure is present. However, naive permutations to empirically estimate the null distribution of a statistic of interest are not appropriate in the presence of population structure, because the samples are not exchangeable with each other. For this reason we developed FlexLMM, a Nextflow pipeline that runs linear mixed models while allowing for flexibility in the definition of the exact statistical model to be used. FlexLMM can also be used to set a significance threshold via permutations, thanks to a two-step process where the population structure is first regressed out, and only then are the permutations performed. We envision this pipeline will be particularly useful for researchers working on multi-parental crosses among inbred lines of model organisms or farm…
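The two-step idea in the abstract (remove the population structure first, permute second) can be illustrated with a minimal NumPy sketch. This is not FlexLMM's code: the function names, the use of a Cholesky factor of the phenotype covariance for decorrelation, the max-statistic threshold, and the assumption that the variance components `sigma_g2` and `sigma_e2` are already known (in practice they would be estimated from a null model) are all illustrative choices made here.

```python
import numpy as np

def make_whitener(K, sigma_g2, sigma_e2):
    """Return a function applying L^{-1}, where V = sigma_g2*K + sigma_e2*I = L L^T.

    K is a kinship (relatedness) matrix; the variance components are assumed known.
    """
    n = K.shape[0]
    V = sigma_g2 * K + sigma_e2 * np.eye(n)
    L = np.linalg.cholesky(V)
    return lambda a: np.linalg.solve(L, a)

def residualize(a, X):
    """Remove the least-squares fit of a on the columns of X."""
    beta, *_ = np.linalg.lstsq(X, a, rcond=None)
    return a - X @ beta

def max_abs_cosine(y, G):
    """Largest absolute cosine similarity between y and any column of G."""
    num = G.T @ y
    den = np.linalg.norm(G, axis=0) * np.linalg.norm(y)
    return float(np.max(np.abs(num / den)))

def permutation_threshold(y, G, X, K, sigma_g2, sigma_e2,
                          n_perm=200, alpha=0.05, seed=0):
    """Step 1: decorrelate and regress out covariates. Step 2: permute.

    Returns (threshold, observed), both on the max-|cosine| scale.
    """
    rng = np.random.default_rng(seed)
    w = make_whitener(K, sigma_g2, sigma_e2)
    X_w = w(X)
    # After whitening, samples are approximately exchangeable under the null.
    y_r = residualize(w(y), X_w)
    G_r = residualize(w(G), X_w)
    null_max = np.array([
        max_abs_cosine(rng.permutation(y_r), G_r) for _ in range(n_perm)
    ])
    threshold = float(np.quantile(null_max, 1.0 - alpha))
    return threshold, max_abs_cosine(y_r, G_r)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    n, m = 60, 30
    G = rng.binomial(2, 0.3, size=(n, m)).astype(float)  # toy genotypes
    Z = rng.normal(size=(n, 10))
    K = Z @ Z.T / 10.0                                    # toy kinship matrix
    X = np.ones((n, 1))                                   # intercept only
    y = rng.normal(size=n)                                # null phenotype
    thr, obs = permutation_threshold(y, G, X, K, 1.0, 0.5)
    print(f"threshold={thr:.3f} observed={obs:.3f}")
```

Taking the maximum statistic across all variants in each permutation yields a genome-wide threshold rather than a per-variant one, which is why the sketch records a single value per permutation.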
