A deep learning classifier for local ancestry inference
Matthew Aguirre, Jan Sokol, Guhan Venkataraman, Alexander Ioannidis

TL;DR
This paper introduces a deep learning approach to local ancestry inference (LAI) that frames the problem as an image segmentation task and yields ancestry assignments nearly as accurate as those of RFMix, the existing gold-standard tool.
Contribution
The paper formulates LAI as an image segmentation problem and develops a deep convolutional neural network with an encoder-decoder architecture to solve it, a formulation presented as new for this task.
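To make the segmentation framing concrete, the sketch below treats one haplotype as a 1-D "image" of alleles and produces one ancestry label per SNP position, the way a segmentation network labels each pixel. It is a minimal illustration with random, untrained weights, not the authors' architecture: the layer sizes, the kernel width, the single pooling step, and the names `conv1d_same` and `segment` are all assumptions made for this example.

```python
import numpy as np

N_ANCESTRIES = 5  # five continental ancestry groups, per the abstract


def conv1d_same(x, w):
    """1-D convolution with zero padding so output length equals input length.

    x: (C_in, L) input, w: (C_out, C_in, K) kernel with odd K.
    """
    c_out, _, k = w.shape
    length = x.shape[1]
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad)))
    out = np.zeros((c_out, length))
    for i in range(k):
        # Contribution of kernel tap i: (C_out, C_in) @ (C_in, L)
        out += w[:, :, i] @ xp[:, i:i + length]
    return out


def segment(haplotype, params):
    """Hypothetical toy encoder-decoder; assumes an even number of SNPs."""
    x = haplotype[None, :].astype(float)                 # (1, L)
    h = np.maximum(conv1d_same(x, params["enc"]), 0.0)   # encoder conv + ReLU
    length = h.shape[1]
    h = h.reshape(h.shape[0], length // 2, 2).max(axis=2)  # downsample by 2
    h = np.repeat(h, 2, axis=1)                          # upsample back to L
    logits = conv1d_same(h, params["dec"])               # (N_ANCESTRIES, L)
    return logits.argmax(axis=0)                         # one label per SNP


rng = np.random.default_rng(0)
params = {
    "enc": rng.normal(size=(8, 1, 5)),             # 1 input channel -> 8 features
    "dec": rng.normal(size=(N_ANCESTRIES, 8, 5)),  # 8 features -> 5 class scores
}
haplotype = rng.integers(0, 2, size=64)  # 0/1 alleles at 64 SNPs
labels = segment(haplotype, params)
print(labels.shape)
```

The point of the sketch is the input and output shapes: a length-L allele sequence goes in and a length-L ancestry labelling comes out, which is what makes the task a segmentation problem rather than a per-individual classification.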
Findings
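The model's ancestry assignments are nearly as accurate as those of RFMix, the existing gold-standard tool.
The model learns admixture as a zero-shot task: it is trained only on unadmixed individuals.
The model is evaluated on simulated admixed data derived from 279 additional individuals from the same populations.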
Abstract
Local ancestry inference (LAI) identifies the ancestry of each segment of an individual's genome and is an important step in medical and population genetic studies of diverse cohorts. Several techniques have been used for LAI, including Hidden Markov Models and Random Forests. Here, we formulate the LAI task as an image segmentation problem and develop a new LAI tool using a deep convolutional neural network with an encoder-decoder architecture. We train our model using complete genome sequences from 982 unadmixed individuals from each of five continental ancestry groups, and we evaluate it using simulated admixed data derived from an additional 279 individuals selected from the same populations. We show that our model is able to learn admixture as a zero-shot task, yielding ancestry assignments that are nearly as accurate as those from the existing gold standard tool, RFMix.
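The abstract does not say how the simulated admixed data were generated. As a rough illustration of the idea only, the sketch below builds an admixed haplotype as a mosaic of segments copied from unadmixed source haplotypes and records the true ancestry label at every SNP. Uniformly random breakpoints and uniformly random source choice are simplifying assumptions (a realistic simulator would use a recombination map and a number of generations since admixture), and the function name `simulate_admixed` is hypothetical.

```python
import numpy as np


def simulate_admixed(haplotypes, ancestries, n_breakpoints, rng):
    """Stitch a mosaic haplotype from unadmixed sources.

    haplotypes: (n_hap, n_snps) array of alleles from unadmixed individuals.
    ancestries: (n_hap,) integer ancestry label of each source haplotype.
    Returns the mosaic haplotype and its per-SNP ancestry labels.
    """
    n_hap, n_snps = haplotypes.shape
    # Assumption: breakpoints are uniform over SNP indices, without repeats.
    cuts = np.sort(rng.choice(np.arange(1, n_snps), size=n_breakpoints,
                              replace=False))
    bounds = np.concatenate(([0], cuts, [n_snps]))
    mosaic = np.empty(n_snps, dtype=haplotypes.dtype)
    labels = np.empty(n_snps, dtype=ancestries.dtype)
    for start, end in zip(bounds[:-1], bounds[1:]):
        src = rng.integers(n_hap)  # pick a source haplotype for this segment
        mosaic[start:end] = haplotypes[src, start:end]
        labels[start:end] = ancestries[src]
    return mosaic, labels


rng = np.random.default_rng(1)
sources = rng.integers(0, 2, size=(10, 200))  # toy unadmixed haplotypes
source_ancestry = np.arange(10) % 5           # toy labels for five groups
admixed, truth = simulate_admixed(sources, source_ancestry, 4, rng)
print(admixed.shape, truth.shape)
```

Because every segment's source is known, the per-SNP labels give an exact ground truth against which a model trained only on unadmixed genomes can be scored, which is the zero-shot setting the abstract describes.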
Taxonomy
Topics: Forensic and Genetic Research · Genetic Associations and Epidemiology · Molecular Biology Techniques and Applications
