Obscured-ensemble models for genomic prediction
Rounak Saha, Amir Morshedian, Jia Sun, Robert Duncan, Mike Domaratzki

TL;DR
This paper introduces an obscured-ensemble model for genomic prediction that improves efficiency without sacrificing accuracy.
Contribution
The novel obscured-ensemble model uses selective feature subsets and genotype similarity for efficient genomic prediction.
Findings
Genomic prediction can be achieved using only 20% of the obscured markers from each reference genotype, without loss of accuracy.
The obscured-ensemble model performs well even with limited genotype data and random subset selection.
The model avoids shortcut learning by not relying on genomic linkage.
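The idea of combining a random marker subset with genotype similarity can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `obscured_features`, the 0/1/2 marker coding, the matching-fraction similarity, and the synthetic data are all assumptions made for illustration. The sketch only shows how a target genotype could be described by its similarity to each reference genotype on a random 20% of markers, so that a downstream model never receives the raw genomic content.

```python
import numpy as np


def obscured_features(targets, references, fraction=0.2, seed=0):
    """Hypothetical helper (not from the paper).

    For each reference genotype, draw a random subset of markers and score
    every target genotype by the fraction of marker calls in that subset
    that match the reference. Only these similarity scores are returned.
    """
    rng = np.random.default_rng(seed)
    n_ref, n_markers = references.shape
    k = max(1, int(round(fraction * n_markers)))  # markers used per reference
    feats = np.empty((targets.shape[0], n_ref))
    for j in range(n_ref):
        # A fresh random subset for each reference genotype (assumption).
        idx = rng.choice(n_markers, size=k, replace=False)
        feats[:, j] = (targets[:, idx] == references[j, idx]).mean(axis=1)
    return feats


# Synthetic genotypes coded 0/1/2 (assumed coding, random data).
rng = np.random.default_rng(42)
references = rng.integers(0, 3, size=(5, 100))
targets = rng.integers(0, 3, size=(3, 100))

features = obscured_features(targets, references)
print(features.shape)
```

Each row of `features` could then be passed to any regressor in place of the marker matrix. How the paper actually builds its ensemble and similarity measure is not specified in this summary.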
Abstract
Genomic Prediction (GP) uses dense whole-genome marker sets from lines of a crop to predict agronomic traits for untested genotypes. In recent years, deep learning (DL) approaches for genomic prediction have demonstrated state-of-the-art results. However, substantial variation exists in DL outcomes for GP, as the success of DL depends on the architecture of the model used, the amount of data available, and the population structure of the individuals in the training set. In this paper, we consider an obscured model for GP, where the model is not provided with genomic content. The obscured model was intended to evaluate the possibility of so-called shortcut learning in GP. We conclude that we can perform GP using the obscured model with only 20% of the obscured markers from each reference genotype. This selective feature usage significantly enhances the efficiency of our…
Taxonomy
Topics: Genetic and phenotypic traits in livestock · Genetic Associations and Epidemiology · Genetic Mapping and Diversity in Plants and Animals
