Genetic Classification of Populations using Supervised Learning
M. Bridges, E. A. Heron, C. O'Dushlaine, R. Segurado, The International Schizophrenia Consortium (ISC), D. Morris, A. Corvin, M. Gill, C. Pinto

TL;DR
This paper demonstrates that supervised learning methods like neural networks and support vector machines outperform traditional unsupervised techniques in classifying populations based on genetic data, especially in complex scenarios.
Contribution
The study shows that supervised approaches significantly improve population classification accuracy over principal components analysis (PCA), exceeding the theoretical sensitivity limits of unsupervised methods.
Findings
Supervised methods outperform PCA in population classification.
Neural networks and SVMs distinguish closely related populations.
Supervised approaches exceed theoretical sensitivity limits of unsupervised methods.
Abstract
There are many instances in genetics in which we wish to determine whether two candidate populations are distinguishable on the basis of their genetic structure. Examples include populations which are geographically separated, case-control studies and quality control (when participants in a study have been genotyped at different laboratories). This latter application is of particular importance in the era of large-scale genome-wide association studies, when collections of individuals genotyped at different locations are being merged to provide increased power. The traditional method for detecting structure within a population is some form of exploratory technique such as principal components analysis. Such methods, which do not utilise our prior knowledge of the membership of the candidate populations, are termed unsupervised. Supervised methods, on the other hand, are able to…
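The distinction between the two families of methods can be sketched on simulated data. The example below is a minimal illustration under stated assumptions, not the authors' pipeline: the genotypes are synthetic, the function names and parameter values (`simulate_genotypes`, `drift_sd`, and so on) are hypothetical, and a simple nearest-centroid classifier stands in for the neural networks and support vector machines used in the paper. The unsupervised route splits individuals on the sign of the first principal component without seeing labels; the supervised route uses the known population labels for training and is scored on held-out individuals.

```python
import numpy as np


def simulate_genotypes(n_per_pop=200, n_snps=500, drift_sd=0.05, seed=0):
    """Simulate 0/1/2 genotype counts for two populations (assumed toy model)."""
    rng = np.random.default_rng(seed)
    base = rng.uniform(0.1, 0.9, size=n_snps)  # allele frequencies, population A
    # Population B: the same frequencies with a small random shift per SNP.
    shifted = np.clip(base + rng.normal(0.0, drift_sd, size=n_snps), 0.05, 0.95)
    pop_a = rng.binomial(2, base, size=(n_per_pop, n_snps))
    pop_b = rng.binomial(2, shifted, size=(n_per_pop, n_snps))
    X = np.vstack([pop_a, pop_b]).astype(float)
    y = np.repeat([0, 1], n_per_pop)  # known population membership
    return X, y


def pca_split_accuracy(X, y):
    """Unsupervised: split on the sign of PC1; labels are used only for scoring."""
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    pc1 = U[:, 0] * S[0]
    guess = (pc1 > 0).astype(int)
    acc = float(np.mean(guess == y))
    # The sign of a principal component is arbitrary, so take the better labelling.
    return max(acc, 1.0 - acc)


def centroid_holdout_accuracy(X, y, test_fraction=0.25, seed=1):
    """Supervised stand-in: nearest-centroid classifier scored on held-out data."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    n_test = int(len(y) * test_fraction)
    test, train = order[:n_test], order[n_test:]
    mu0 = X[train][y[train] == 0].mean(axis=0)
    mu1 = X[train][y[train] == 1].mean(axis=0)
    d0 = ((X[test] - mu0) ** 2).sum(axis=1)
    d1 = ((X[test] - mu1) ** 2).sum(axis=1)
    pred = (d1 < d0).astype(int)  # assign to the closer class centroid
    return float(np.mean(pred == y[test]))


X, y = simulate_genotypes()
print("PC1 split accuracy:", pca_split_accuracy(X, y))
print("Held-out nearest-centroid accuracy:", centroid_holdout_accuracy(X, y))
```

Lowering `drift_sd` or `n_snps` makes the two simulated populations harder to tell apart, which is the regime the summary above describes as most favourable to supervised methods; the toy model makes no claim to reproduce the paper's results.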
