Centromere reference models for human chromosomes X and Y satellite arrays
Karen H. Miga, Yulia Newton, Miten Jain, Nicolas Altemose, Huntington F. Willard, W. James Kent

TL;DR
This study presents the first linear assemblies of human centromeric satellite arrays on chromosomes X and Y, providing a foundation for understanding these complex, repetitive regions and their variation across populations.
Contribution
The paper provides the first linear reference models for the centromeric satellite arrays of human chromosomes X and Y, so that regions previously omitted from genomic studies can be analyzed.
Findings
Assembled 3.83 Mb of centromeric DNA across the haploid satellite arrays of chromosomes X and Y from an individual genome.
Identified two ancient satellite array variants across populations.
Evaluated sequence variation and mappability within centromeric arrays.
Abstract
The human genome remains incomplete, with multi-megabase sized gaps representing the endogenous centromeres and other heterochromatic regions. These regions are commonly enriched with long arrays of near-identical tandem repeats, known as satellite DNAs, that offer a limited number of variant sites to differentiate individual repeat copies across millions of bases. This substantial sequence homogeneity challenges available assembly strategies, and as a result, centromeric regions are omitted from ongoing genomic studies. To address this problem, we present a locally ordered assembly across two haploid human satellite arrays on chromosomes X and Y, resulting in an initial linear representation of 3.83 Mb of centromeric DNA within an individual genome. To further expand the utility of each centromeric reference sequence, we evaluate sites within the arrays for short-read mappability and…
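To make the homogeneity problem described in the abstract concrete, here is a minimal sketch of how few positions in a tandem array can be told apart by short sequences. It is an illustration under stated assumptions, not the authors' method: the monomer is a made-up 12-base toy sequence (not real satellite DNA), the function name `unique_kmer_positions` and the choice `k = 8` are hypothetical, and "k-mer occurs exactly once" is used only as a rough proxy for short-read mappability.

```python
from collections import Counter


def unique_kmer_positions(seq: str, k: int) -> list[int]:
    """Return start positions whose k-mer occurs exactly once in seq."""
    n_kmers = len(seq) - k + 1
    counts = Counter(seq[i:i + k] for i in range(n_kmers))
    return [i for i in range(n_kmers) if counts[seq[i:i + k]] == 1]


# Hypothetical toy monomer; real satellite repeats are much longer.
monomer = "ACGTTGCAAGCT"

# Six tandem copies, with one single-base variant (G -> A) in the fourth copy.
copies = [monomer] * 6
copies[3] = monomer[:5] + "A" + monomer[6:]
array = "".join(copies)

k = 8  # assumed "read length" for this toy example
unique = unique_kmer_positions(array, k)
fraction_unique = len(unique) / (len(array) - k + 1)

print(f"array length: {len(array)}")
print(f"unique {k}-mer start positions: {unique}")
print(f"fraction of {k}-mers that are unique: {fraction_unique:.2f}")
```

In this toy array only the k-mers that overlap the single variant base are unique; every other k-mer recurs in each identical copy, which is the sense in which a satellite array offers few sites for distinguishing repeat copies.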
Taxonomy
Topics: Chromosomal and Genetic Variations · Genomic variations and chromosomal abnormalities · Plant Disease Resistance and Genetics
