Simultaneous Selection of Multiple Important Single Nucleotide Polymorphisms in Familial Genome Wide Association Studies Data

Subhabrata Majumdar; Saonli Basu; Matt McGue; Snigdhansu Chatterjee

arXiv:1802.01141·stat.AP·April 30, 2025

TL;DR

This paper introduces a fast, resampling-based variable selection method for identifying relevant SNPs in family-based genome-wide association studies, improving detection power over traditional single-marker tests.

Contribution

It presents a novel, computationally efficient model selection approach using the e-values framework that accounts for familial dependencies and detects multiple SNPs simultaneously.

Findings

01

Detects trait-associated SNPs more effectively than traditional single-marker tests

02

Successfully identified SNPs linked to alcohol consumption in real data

03

Scalable bootstrap procedure enhances computational efficiency

Abstract

We propose a fast, resampling-based variable selection technique for detecting relevant single nucleotide polymorphisms (SNPs) in a multi-marker mixed effect model. Because of computational complexity, current practice primarily involves testing the effect of one SNP at a time, commonly termed 'single SNP association analysis'. Joint modeling of genetic variants within a gene or pathway may have better power to detect associated genetic variants, especially those with weak effects. In this paper, we propose a computationally efficient model selection approach, based on the e-values framework, for single SNP detection in families while utilizing information on multiple SNPs simultaneously. To overcome the computational bottleneck of traditional model selection methods, our method trains a single model and uses a fast and scalable bootstrap procedure. We illustrate through…
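The "fit one model, then bootstrap" idea in the abstract can be illustrated with a toy sketch. This is not the paper's procedure: it assumes ordinary least squares with independent observations instead of a mixed effect model for families, a Mahalanobis-type depth as the scoring function, a bootstrap scale of `log(n)`, and a simple "drop a coefficient and compare scores" rule. The names `mahalanobis_depth` and `select_by_evalues` are hypothetical.

```python
import numpy as np


def mahalanobis_depth(points, cloud):
    """Depth of each row of `points` relative to the empirical cloud (assumed depth choice)."""
    center = cloud.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(cloud, rowvar=False))
    diff = points - center
    d2 = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)  # squared Mahalanobis distances
    return 1.0 / (1.0 + d2)


def select_by_evalues(X, y, n_boot=1000, seed=0):
    """Toy drop-one-coefficient selection from a single OLS fit (illustrative only)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape

    # Fit the full model exactly once.
    xtx_inv = np.linalg.inv(X.T @ X)
    beta_hat = xtx_inv @ X.T @ y
    resid = y - X @ beta_hat

    # Weighted bootstrap around the single fit: no refitting per resample.
    # The inflation factor log(n) is an assumption made for this sketch.
    tau = np.log(n)
    weights = rng.exponential(1.0, size=(n_boot, n)) - 1.0  # mean 0, variance 1
    draws = beta_hat + tau * ((weights * resid) @ X @ xtx_inv)

    # Score of the full model: mean depth of the bootstrap draws in their own cloud.
    e_full = float(mahalanobis_depth(draws, draws).mean())

    # Score with coefficient j forced to zero, measured against the full cloud.
    e_drop = []
    for j in range(p):
        reduced = draws.copy()
        reduced[:, j] = 0.0
        e_drop.append(float(mahalanobis_depth(reduced, draws).mean()))

    # Keep predictor j if zeroing it makes the estimate less central than the full model.
    selected = [j for j in range(p) if e_drop[j] < e_full]
    return selected, e_full, e_drop


# Simulated example: predictors 0, 2 and 5 carry signal, the rest are noise.
rng = np.random.default_rng(0)
X = rng.standard_normal((500, 6))
beta = np.array([2.0, 0.0, -1.5, 0.0, 0.0, 1.0])
y = X @ beta + rng.standard_normal(500)
selected, e_full, e_drop = select_by_evalues(X, y, n_boot=1000, seed=1)
print(selected)
```

The point of the sketch is the cost profile: the model is fitted once, and every bootstrap draw is a cheap linear perturbation of that fit, so scoring all predictors takes one pass over the same set of draws instead of one refit per candidate model.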
