A deep learning classifier for local ancestry inference

Matthew Aguirre; Jan Sokol; Guhan Venkataraman; Alexander Ioannidis

arXiv:2011.02081·q-bio.GN·November 5, 2020

TL;DR

This paper introduces a novel deep learning approach for local ancestry inference by framing it as an image segmentation task, achieving accuracy comparable to established methods.

Contribution

It develops a deep convolutional neural network with an encoder-decoder architecture for local ancestry inference (LAI), treating the task as image segmentation, a formulation not previously explored.

Findings

01. Model achieves near gold-standard accuracy

02. Learns admixture as a zero-shot task

03. Effective on simulated admixed data

Abstract

Local ancestry inference (LAI) identifies the ancestry of each segment of an individual's genome and is an important step in medical and population genetic studies of diverse cohorts. Several techniques have been used for LAI, including Hidden Markov Models and Random Forests. Here, we formulate the LAI task as an image segmentation problem and develop a new LAI tool using a deep convolutional neural network with an encoder-decoder architecture. We train our model using complete genome sequences from 982 unadmixed individuals from each of five continental ancestry groups, and we evaluate it using simulated admixed data derived from an additional 279 individuals selected from the same populations. We show that our model is able to learn admixture as a zero-shot task, yielding ancestry assignments that are nearly as accurate as those from the existing gold standard tool, RFMix.
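
To make the segmentation framing concrete, the following is a minimal sketch of how a simulated admixed haplotype carries one ancestry label per site, in the same way a segmentation target carries one class per pixel. It is not the authors' simulation procedure: the function names, the toy ancestry labels ("A", "B", "C"), the uniform choice of crossover points, and the per-site accuracy metric are all assumptions made for illustration.

```python
import random

def simulate_admixed(founders, labels, n_breaks, rng):
    """Build one mosaic haplotype by copying segments from random founders.

    founders: equal-length lists of 0/1 alleles, one per unadmixed haplotype
    labels:   ancestry label of each founder (hypothetical toy labels here)
    Returns (alleles, ancestry), each with one entry per site.
    """
    n_sites = len(founders[0])
    # Random crossover points split the haplotype into n_breaks + 1 segments.
    cuts = sorted(rng.sample(range(1, n_sites), n_breaks))
    bounds = [0] + cuts + [n_sites]
    alleles, ancestry = [], []
    for start, end in zip(bounds, bounds[1:]):
        i = rng.randrange(len(founders))  # founder copied for this segment
        alleles.extend(founders[i][start:end])
        ancestry.extend([labels[i]] * (end - start))
    return alleles, ancestry

def per_site_accuracy(predicted, truth):
    """Fraction of sites whose predicted ancestry matches the true label."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)

rng = random.Random(0)
n_sites = 20
founders = [[rng.randint(0, 1) for _ in range(n_sites)] for _ in range(4)]
labels = ["A", "B", "C", "A"]
alleles, ancestry = simulate_admixed(founders, labels, n_breaks=3, rng=rng)
```

In this framing, a classifier would take `alleles` as input and emit a sequence the same length as `ancestry`; `per_site_accuracy` then compares the two site by site.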

Code & Models

Repositories

maguirre1/deepLAI (TensorFlow, official)

Taxonomy

Topics: Forensic and Genetic Research · Genetic Associations and Epidemiology · Molecular Biology Techniques and Applications