Writing the dark matter of the human genome into mice to better replicate human disease
David M Truong

Abstract
- National Institute of Allergy and Infectious Diseases (10.13039/100000060)
Since the 3.2 billion nucleotide human genome sequence was revealed, researchers have speculated on the function of the 99% of the genome that does not code for protein, which some refer to as the ‘dark matter’ of the genome. These vast stretches of DNA hide the regulatory ‘codes’ that tell genes when, where and how strongly to be expressed in every cell type in the body. We still do not fully understand this code, which notably contains ∼90% of human disease-associated DNA variations. Now, Zhang et al. (1), working in Jef Boeke’s group at New York University Grossman School of Medicine in the USA, have developed a way to study these human regulatory codes by genetically ‘writing’ as much as 180 kilobases of human DNA in place of the corresponding mouse regions in mouse embryonic stem (ES) cells.
This is an exciting development for those using mice as models for human disease. For many diseases, the mouse genes do not accurately replicate the phenotypes found in humans, and some human genes do not even exist in mice. While mice engineered with human genes have been used for decades, these models are constrained by the size limit of current precision genome editing, around 10 kilobases. More importantly, they rely on the mouse regulatory code for gene expression, which often fails to reflect human biology.
Regulatory DNA comprises the gene promoter along with scattered bits of sequence called enhancers, hidden across huge swathes of DNA anywhere from 5 to hundreds of kilobases away from the gene. These sequences and their arrangement differ greatly between humans and mice, resulting in genes being expressed in different cell types and at different levels. The new technique, called ‘mSwap-In’, enables the introduction of nearly 20 times more DNA at a specific site compared to previous methods, enough to bring a human gene together with its full regulatory code into mice.
The researchers demonstrated the scope and use of their method by engineering a human ACE2 mouse model that better modeled human COVID-19 infection patterns. mSwap-In uses a combination of CRISPR/Cas9, synthetic DNA assembly and antibiotic markers to write the 180 kilobase human ACE2 region in place of the smaller 72 kilobase mouse region in mouse ES cells. These engineered ES cells were then rapidly turned into full-grown mice using a method called tetraploid complementation.
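The sizes quoted above can be checked with a little arithmetic. The following is only a minimal sketch: the constant and function names are illustrative, the 10 kilobase figure is the approximate editing limit mentioned earlier rather than an exact value, and nothing here comes from the authors' own analysis.

```python
# Sizes quoted in the text, in kilobases (kb). Names are illustrative only.
HUMAN_ACE2_REGION_KB = 180  # human region written by mSwap-In
MOUSE_ACE2_REGION_KB = 72   # mouse region that it replaces
PRIOR_EDIT_LIMIT_KB = 10    # approximate size limit of earlier precision editing


def fold_change(new_kb: float, old_kb: float) -> float:
    """Return how many times larger new_kb is than old_kb."""
    if old_kb <= 0:
        raise ValueError("old_kb must be positive")
    return new_kb / old_kb


# Payload size relative to the approximate prior editing limit.
payload_gain = fold_change(HUMAN_ACE2_REGION_KB, PRIOR_EDIT_LIMIT_KB)
# Size of the human region relative to the mouse region it replaces.
locus_expansion = fold_change(HUMAN_ACE2_REGION_KB, MOUSE_ACE2_REGION_KB)
# Net amount of DNA gained at the locus after the swap.
net_added_kb = HUMAN_ACE2_REGION_KB - MOUSE_ACE2_REGION_KB

print(f"Payload vs. prior limit: {payload_gain:.0f}x")
print(f"Human vs. mouse region: {locus_expansion:.1f}x")
print(f"Net DNA added at the locus: {net_added_kb} kb")
```

Under these figures the payload is 18 times the earlier limit, which is the basis for the "nearly 20 times" comparison, and the human region is 2.5 times the size of the mouse region it replaces.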
The ACE2 gene encodes the key cell-surface protein that allows the coronavirus SARS-CoV-2 to enter the cell. Mice are naturally resistant to SARS-CoV-2, whereas a previously engineered human ACE2 mouse model expressed too much ACE2 protein on the cell surface, causing the mice to die too quickly from COVID-19 (2). In contrast, the newly engineered mouse model had mild infection symptoms more consistent with those observed in humans (1). Moreover, the cell and tissue expression patterns more accurately reflected those found in humans, with ACE2 now found in the mouse testis and expressed at lower levels in the lung. This is consistent with the differences in the regulatory code.
Mouse models like these may help reduce the failure rate of therapeutics, which are often tested first in normal mice. The work also ushers in the era of mammalian genome writing. The next challenge is genetic writing at this scale in human cells. The authors state that mSwap-In would likely work for other species. It should be noted, however, that human pluripotent stem cells (hPSCs) die more easily in response to CRISPR/Cas9 editing (3). A similar method used in hPSCs by the Aizawa group in Japan identified only three cell clones with a 10 kilobase integration (4). Given these cellular differences, it will be interesting to see whether the same can be done in human stem cells.
References
- 1. Zhang W., Golynker I., Brosh R., Fajardo A., Zhu Y., Wudzinska A.M., Ordonez R., Ribeiro-Dos-Santos A.M., Carrau L., Damani-Yokota P. et al. (2023) Mouse genome rewriting and tailoring of three important disease loci. Nature, 623, 423–431. doi: 10.1038/s41586-023-06675-4. PMID: 37914927. PMCID: PMC10632133.
- 2. Zheng J., Wong L.R., Li K., Verma A.K., Ortiz M.E., Wohlford-Lenane C., Leidinger M.R., Knudson C.M., Meyerholz D.K., McCray P.B. Jr. et al. (2021) COVID-19 treatments and pathogenesis including anosmia in K18-hACE2 mice. Nature, 589, 603–607. doi: 10.1038/s41586-020-2943-z. PMID: 33166988. PMCID: PMC7855185.
- 3. Ihry R.J., Worringer K.A., Salick M.R., Frias E., Ho D., Theriault K., Kommineni S., Chen J., Sondey M., Ye C. et al. (2018) p53 inhibits CRISPR-Cas9 engineering in human pluripotent stem cells. Nat. Med., 24, 939–946. doi: 10.1038/s41591-018-0050-6. PMID: 29892062.
- 4. Ohno T., Akase T., Kono S., Kurasawa H., Takashima T., Kaneko S. and Aizawa Y. (2022) Biallelic and gene-wide genomic substitution for endogenous intron and retroelement mutagenesis in human cells. Nat. Commun., 13, 4219. doi: 10.1038/s41467-022-31982-1. PMID: 35864085. PMCID: PMC9304424.
