# High-Accuracy Long-Read Sequencing of Mycobacterium tuberculosis PSNK363 Isolated From the Democratic People's Republic of Korea

**Authors:** Thi-Binh Dang, Nackmoon Sung, Kyunghyun Lim, Soyoung Lee, Jaehyun Jeon, Sanghoon Jheon

PMC · DOI: 10.1155/cjid/2234550 · The Canadian Journal of Infectious Diseases & Medical Microbiology = Journal Canadien des Maladies Infectieuses et de la Microbiologie Médicale · 2025-02-11

## TL;DR

This paper reports the first high-accuracy (HiFi long-read) whole-genome sequence of a clinical Mycobacterium tuberculosis strain from North Korea. Relative to the widely used H37Rv reference, the genome is 10,578 bp longer, has more protein-coding genes, and carries a large inversion.

## Contribution

The study provides the first known de novo whole-genome assembly of a clinical M. tuberculosis strain from North Korea, PSNK363, built from PacBio Sequel II HiFi long reads.

## Key findings

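- The PSNK363 genome is a single circular chromosome of 4,422,110 bp, 10,578 bp longer than H37Rv, and contains a large inversion that holds most of the deletion and insertion variants found relative to H37Rv.
- PSNK363 has more protein-coding genes than H37Rv (4079 annotated coding sequences), and the authors suggest the assembly could enrich virulence databases and help identify informative loci for drug resistance analysis.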
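The length comparison can be checked with simple arithmetic. The short sketch below uses only the two figures reported in the abstract (4,422,110 bp and 10,578 bp); the implied H37Rv length and the relative difference are derived here for illustration and are not quoted from the paper. The variable names are illustrative.

```python
# Figures as reported in the paper's abstract.
PSNK363_LENGTH_BP = 4_422_110
LENGTH_DIFFERENCE_BP = 10_578

# Implied H37Rv chromosome length (derived here, not quoted from the paper).
implied_h37rv_length_bp = PSNK363_LENGTH_BP - LENGTH_DIFFERENCE_BP

# Size difference expressed as a percentage of the implied H37Rv length.
relative_increase_pct = 100 * LENGTH_DIFFERENCE_BP / implied_h37rv_length_bp

print(implied_h37rv_length_bp)          # 4411532
print(round(relative_increase_pct, 2))  # 0.24
```

The difference is therefore small in relative terms (about a quarter of a percent of the chromosome), which is consistent with the paper's emphasis on structural variation, such as the inversion, rather than on overall genome size.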

## Abstract

Long-read sequencing is a valuable technique for high-precision genome analysis. Despite the widespread use of the Mycobacterium tuberculosis H37Rv genome sequence as a reference for genetic variation analysis, its suitability for comparing clinical strains is limited. Therefore, we constructed the first known whole genome of a clinical M. tuberculosis strain, PSNK363, isolated from the Democratic People's Republic of Korea, using high-quality high-fidelity (HiFi) read sequencing and compared its genetic variations to those of H37Rv. PSNK363 was cultured to obtain genomic DNA, which was subjected to de novo whole-genome assembly using PacBio Sequel II with long-read HiFi sequencing. The sequences were compared to the reference genome H37Rv. HiFi long-read sequencing of M. tuberculosis PSNK363, with an accuracy of 99.99%, revealed a single circular chromosome of 4,422,110 bp, which is 10,578 bp longer than the H37Rv chromosome. The assembly had an average G + C content of 65.6%, 4079 protein-coding sequences, 53 tRNA genes, and 3 rRNA genes. Most genes (72.7%) were assigned putative functions, whereas the remaining 27.3% were annotated as hypothetical. Comparison with H37Rv revealed a large inversion in the PSNK363 genome, which contains most of the deletion and insertion variants. M. tuberculosis PSNK363 had a longer genome sequence, more protein-coding genes, and a larger inversion region than H37Rv. High-accuracy whole-genome sequencing of PSNK363 holds the potential for enriching virulence databases and identifying informative loci for drug resistance analysis in M. tuberculosis isolates in the Democratic People's Republic of Korea.
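Summary statistics such as chromosome length and average G + C content can be recomputed from any assembly in FASTA format. The following is a minimal sketch in plain Python, not the authors' pipeline: `parse_fasta`, `gc_content_pct`, and the toy sequence are hypothetical names and data introduced here for illustration, and ambiguous bases (such as `N`) are assumed to be excluded from the G + C denominator.

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Parse FASTA-formatted text into {record_id: sequence}."""
    records: dict[str, list[str]] = {}
    current_id = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token as the record ID.
            current_id = line[1:].split()[0]
            records[current_id] = []
        elif current_id is not None:
            records[current_id].append(line.upper())
    return {rid: "".join(chunks) for rid, chunks in records.items()}


def gc_content_pct(sequence: str) -> float:
    """Return G + C as a percentage of unambiguous bases (A, C, G, T)."""
    sequence = sequence.upper()
    gc = sequence.count("G") + sequence.count("C")
    acgt = gc + sequence.count("A") + sequence.count("T")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return 100.0 * gc / acgt


# Toy input for illustration only; this is not PSNK363 sequence data.
toy_fasta = """>toy_contig hypothetical example
GGCCGATC
GCGCNNAT
"""

for record_id, seq in parse_fasta(toy_fasta).items():
    print(record_id, len(seq), round(gc_content_pct(seq), 1))
```

Run against a real assembly file (by reading its text into `parse_fasta`), the same two functions would report the per-record length and G + C percentage that the abstract summarizes as 4,422,110 bp and 65.6%.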

## Linked entities

- **Diseases:** tuberculosis (MONDO:0018076)
- **Species:** Mycobacterium tuberculosis (taxon 1773)

## Full-text entities

- **Species:** Mycobacterium tuberculosis (species) [taxon 1773], Mycobacterium tuberculosis H37Rv (strain) [taxon 83332]

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC11835475/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC11835475/full.md

## References

32 references — full list in the complete paper: https://tomesphere.com/paper/PMC11835475/full.md

---
Source: https://tomesphere.com/paper/PMC11835475