Tackling the dimensions in imaging genetics with CLUB-PLS
Andre Altmann, Ana C Lawry Aguila, Neda Jahanshad, Paul M Thompson, Marco Lorenzi

TL;DR
This paper introduces CLUB-PLS, a robust Partial Least Squares framework utilizing cluster bootstrap to analyze high-dimensional imaging genetics data, successfully identifying significant genetic loci linked to brain phenotypes in a large cohort.
Contribution
The paper presents a novel CLUB-PLS framework that handles high-dimensional data and large samples, overcoming limitations of traditional GWAS by capturing broader brain-wide effects.
Findings
Identified 107 significant locus-phenotype pairs in UK Biobank data.
High validation rate of loci using GWAS and GWIS methods.
Demonstrated effectiveness of CLUB-PLS in large-scale imaging genetics analysis.
Abstract
A major challenge in imaging genetics and similar fields is to link high-dimensional data in one domain, e.g., genetic data, to high-dimensional data in a second domain, e.g., brain imaging data. The standard approach in the area is mass univariate analysis across genetic factors and imaging phenotypes, which entails executing one genome-wide association study (GWAS) for each pre-defined imaging measure. Although this approach has been tremendously successful, one shortcoming is that phenotypes must be pre-defined. Consequently, effects that are not confined to pre-selected regions of interest or that reflect larger brain-wide patterns can easily be missed. In this work we introduce a Partial Least Squares (PLS)-based framework, which we term Cluster-Bootstrap PLS (CLUB-PLS), that can work with large input dimensions in both domains as well as with large sample sizes. One key factor of…
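To make the contrast with mass univariate analysis concrete, the sketch below shows the generic idea behind PLS with bootstrap-based stability scores. It is not the authors' CLUB-PLS implementation: the abstract excerpt here does not describe the cluster step, so this example swaps in a plain nonparametric bootstrap over subjects and summarizes each weight by its bootstrap mean divided by its bootstrap standard deviation. The function names `pls_first_component` and `bootstrap_ratios`, and all parameters, are hypothetical.

```python
import numpy as np


def pls_first_component(X, Y):
    """Leading PLS weight pair from the SVD of the cross-covariance matrix.

    X: (n_subjects, n_genetic_features), Y: (n_subjects, n_imaging_features).
    Returns (u, v, s): weights for X, weights for Y, and the singular value.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    C = Xc.T @ Yc / (X.shape[0] - 1)  # cross-covariance, shape (p, q)
    U, s, Vt = np.linalg.svd(C, full_matrices=False)
    return U[:, 0], Vt[0], s[0]


def bootstrap_ratios(X, Y, n_boot=100, seed=0):
    """Bootstrap mean / bootstrap std for each weight (plain bootstrap).

    Assumption: subjects are resampled with replacement; this is NOT the
    cluster bootstrap named in the paper, only a simple stand-in.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    u_ref, v_ref, _ = pls_first_component(X, Y)
    us = np.empty((n_boot, X.shape[1]))
    vs = np.empty((n_boot, Y.shape[1]))
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        u, v, _ = pls_first_component(X[idx], Y[idx])
        # The sign of an SVD pair is arbitrary, so align it to the reference.
        if u @ u_ref + v @ v_ref < 0:
            u, v = -u, -v
        us[b], vs[b] = u, v
    ratio_u = us.mean(axis=0) / us.std(axis=0, ddof=1)
    ratio_v = vs.mean(axis=0) / vs.std(axis=0, ddof=1)
    return ratio_u, ratio_v


# Synthetic stand-in data: one "genetic" feature drives five "imaging" features.
rng = np.random.default_rng(42)
X = rng.standard_normal((500, 20))
Y = rng.standard_normal((500, 15))
Y[:, :5] += X[:, [3]]  # planted association
ratio_u, ratio_v = bootstrap_ratios(X, Y)
top_genetic = int(np.argmax(np.abs(ratio_u)))
top_imaging = sorted(int(j) for j in np.argsort(-np.abs(ratio_v))[:5])
```

Unlike one GWAS per pre-defined phenotype, a single decomposition of the cross-covariance matrix yields a weight for every genetic and imaging feature at once, which is how a multivariate approach can pick up patterns that span several imaging measures.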
Taxonomy
Topics: Genetic Associations and Epidemiology · Gene expression and cancer classification · Bioinformatics and Genomic Networks
