Mapping to a Reference Genome Structure
Benedict Paten, Adam Novak, David Haussler

TL;DR
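This paper proposes that a reference genome come with a mapping scheme that assigns each base of any DNA string to a position in the reference. It also generalizes the reference from a set of phased chromosomes to a graph, to account for natural genetic variation, in support of comparative genomics, population genetics, and medical genetics.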
Contribution
It introduces the concept of reference structures, detailing their properties and illustrating how graph-based models can better represent genetic diversity.
Findings
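Describes the desirable properties of reference structures
Gives examples of reference structures, including graph-based ones
Generalizes the reference genome from a set of phased chromosomes to a graph that accounts for natural genetic variation, treating phased chromosomes as a special case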
Abstract
To support comparative genomics, population genetics, and medical genetics, we propose that a reference genome should come with a scheme for mapping each base in any DNA string to a position in that reference genome. We refer to a collection of one or more reference genomes and a scheme for mapping to their positions as a reference structure. Here we describe the desirable properties of reference structures and give examples. To account for natural genetic variation, we consider the more general case in which a reference genome is represented by a graph rather than a set of phased chromosomes; the latter is treated as a special case.
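To make the idea of a reference structure concrete, here is a minimal Python sketch of a graph-shaped reference paired with a mapping scheme. It is not the authors' method: the class name `ReferenceStructure`, the single-base node representation, and the rule "map a base only when its k-base context matches exactly one reference position" are all assumptions chosen for illustration. A linear reference is just the special case in which the graph is a chain.

```python
from collections import defaultdict


class ReferenceStructure:
    """Toy reference structure (hypothetical): a sequence graph plus a mapping scheme."""

    def __init__(self, nodes, edges, k=3):
        # nodes: dict of node_id -> single base; edges: iterable of (from_id, to_id).
        # k is the context length and must be odd so that it has a central base.
        if k % 2 == 0:
            raise ValueError("k must be odd")
        self.nodes = nodes
        self.k = k
        self.succ = defaultdict(list)
        for a, b in edges:
            self.succ[a].append(b)
        self.index = self._build_index()

    def _walks(self, start, length):
        # Yield every walk of `length` nodes that begins at `start`.
        if length == 1:
            yield (start,)
            return
        for nxt in self.succ[start]:
            for rest in self._walks(nxt, length - 1):
                yield (start,) + rest

    def _build_index(self):
        # Map each k-base string spelled by a walk to the set of walks spelling it.
        index = defaultdict(set)
        for n in self.nodes:
            for walk in self._walks(n, self.k):
                kmer = "".join(self.nodes[v] for v in walk)
                index[kmer].add(walk)
        return index

    def map_string(self, dna):
        """For each base of `dna`, return a node id, or None if it is left unmapped."""
        half = self.k // 2
        result = [None] * len(dna)
        for i in range(half, len(dna) - half):
            walks = self.index.get(dna[i - half:i + half + 1], ())
            centers = {w[half] for w in walks}
            if len(centers) == 1:  # assumed rule: map only unambiguous contexts
                result[i] = next(iter(centers))
        return result


# A graph with one single-base variant: node 3 is followed by either 4 (T) or 5 (A).
graph = ReferenceStructure(
    nodes={1: "A", 2: "C", 3: "G", 4: "T", 5: "A", 6: "C", 7: "A", 8: "G"},
    edges={(1, 2), (2, 3), (3, 4), (3, 5), (4, 6), (5, 6), (6, 7), (7, 8)},
)
print(graph.map_string("ACGTCAG"))  # [None, 2, 3, 4, 6, 7, None]
print(graph.map_string("ACGACAG"))  # [None, 2, 3, 5, 6, 7, None]

# Special case: a linear reference is a chain of nodes.
linear = ReferenceStructure(
    nodes=dict(enumerate("ACGTCAG")),
    edges={(i, i + 1) for i in range(6)},
)
print(linear.map_string("CGTC"))  # [None, 2, 3, None]
```

In this sketch, the first and last bases of each query stay unmapped only because they lack a full context window. Both alleles of the variant receive their own position in the graph, which a single linear reference cannot offer.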
Taxonomy
Topics: Genomic variations and chromosomal abnormalities · Chromosomal and Genetic Variations · Genomics and Phylogenetic Studies
