Deep Learning for Reference-Free Geolocation for Poplar Trees
Cai W. John, Owen Queen, Wellington Muchero, and Scott J. Emrich

TL;DR
This paper introduces MashNet, a reference-free deep learning model that predicts the geographic origin of poplar trees from unaligned genetic data, offering a faster alternative to traditional genome-based methods.
Contribution
The paper presents MashNet, a novel reference-free deep learning approach for genomic geolocation, reducing computational complexity while maintaining competitive accuracy.
Findings
MashNet achieves a geolocation error of 34.0 km^2.
It performs comparably to the state-of-the-art Locator method.
The approach enables rapid, efficient identification of plant origins for precision agriculture.
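To make "geolocation error" concrete, here is a minimal sketch of one common way to score predicted coordinates: the mean great-circle (haversine) distance between true and predicted latitude/longitude pairs. This is an illustration only. The summary reports the error in km^2 and does not say how it was computed, so the metric below (in km) is an assumption and may differ from the one used in the paper. The function names are hypothetical.

```python
import math

# Mean Earth radius in kilometres (IUGG value).
EARTH_RADIUS_KM = 6371.0088


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def mean_error_km(true_coords, pred_coords):
    """Average haversine distance over paired (lat, lon) tuples.

    Assumed metric for illustration; not necessarily the paper's.
    """
    if len(true_coords) != len(pred_coords) or not true_coords:
        raise ValueError("need two non-empty coordinate lists of equal length")
    dists = [
        haversine_km(t[0], t[1], p[0], p[1])
        for t, p in zip(true_coords, pred_coords)
    ]
    return sum(dists) / len(dists)
```

A call such as `mean_error_km([(45.5, -122.7)], [(45.6, -122.5)])` returns the distance in kilometres between one true and one predicted location.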
Abstract
A core task in precision agriculture is the identification of climatic and ecological conditions that are advantageous for a given crop. The most succinct approach is geolocation, which is concerned with locating the native region of a given sample based on its genetic makeup. Here, we investigate genomic geolocation of Populus trichocarpa, or poplar, which has been identified by the US Department of Energy as a fast-rotation biofuel crop to be harvested nationwide. In particular, we approach geolocation from a reference-free perspective, circumventing the need for compute-intensive processes such as variant calling and alignment. Our model, MashNet, predicts latitude and longitude for poplar trees from randomly-sampled, unaligned sequence fragments. We show that our model performs comparably to Locator, a state-of-the-art method based on aligned whole-genome sequence data. MashNet…
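The abstract does not spell out how unaligned fragments are turned into model inputs. As a hedged illustration of what a reference-free representation can look like, the sketch below builds a Mash-style bottom-s MinHash sketch from k-mers of raw sequence fragments and estimates Jaccard similarity between two samples. Everything here is an assumption made for illustration: the function names, the BLAKE2b hash, and the defaults `k=21` and `size=1000` are not taken from the paper, and the real MashNet pipeline may differ.

```python
import hashlib

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def kmers(seq, k):
    """Yield each length-k substring of seq made up only of A, C, G, T."""
    seq = seq.upper()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            yield kmer


def canonical(kmer):
    """Return the smaller of a k-mer and its reverse complement (strand-neutral)."""
    return min(kmer, kmer.translate(_COMPLEMENT)[::-1])


def hash64(kmer):
    """Deterministic 64-bit hash of a k-mer (hash choice is an assumption)."""
    digest = hashlib.blake2b(kmer.encode("ascii"), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_sketch(fragments, k=21, size=1000):
    """Bottom-s sketch: the `size` smallest distinct k-mer hashes, sorted ascending."""
    hashes = set()
    for fragment in fragments:
        for kmer in kmers(fragment, k):
            hashes.add(hash64(canonical(kmer)))
    return sorted(hashes)[:size]


def jaccard_estimate(sketch_a, sketch_b, size):
    """Estimate Jaccard similarity from two bottom-s sketches.

    Takes the `size` smallest hashes of the union and counts how many
    appear in both sketches.
    """
    merged = sorted(set(sketch_a) | set(sketch_b))[:size]
    if not merged:
        return 0.0
    shared = set(sketch_a) & set(sketch_b)
    return sum(1 for h in merged if h in shared) / len(merged)
```

Because no alignment or variant calling is involved, a sketch like this can be computed directly from reads. A fixed-length vector derived from such sketches is one plausible input to a regression model that outputs latitude and longitude.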
Taxonomy
Topics: Genetic Mapping and Diversity in Plants and Animals · Genomics and Phylogenetic Studies · Chromosomal and Genetic Variations
