# Draft genome dataset of Xylaria sp. KR-3U isolated from leaves of the medicinal plant Catharanthus roseus

**Authors:** Kankana Roy, Abhijit Bandyopadhyay

PMC · DOI: 10.1016/j.dib.2026.112543 · Data in Brief · 2026-02-07

## TL;DR

This paper provides a draft genome sequence of Xylaria sp. KR-3U, an endophyte from Catharanthus roseus leaves in India, including gene annotations and biosynthetic clusters.

## Contribution

The study presents a high-quality draft genome and functional analysis of Xylaria sp. KR-3U, including biosynthetic gene clusters and enzyme profiles.

## Key findings

- The genome assembly contains 11,916 predicted protein-coding genes and is 97.0% complete according to BUSCO analysis.
- The dataset includes 111 biosynthetic gene clusters and 556 CAZyme-encoding genes, of which 39.74% (221 genes) are predicted to be secreted.
- Genome data and annotations are publicly available through NCBI and Mendeley Data for reuse and transparency.

## Abstract

We present a draft genome dataset for Xylaria sp. (KR-3U) isolated as an endophyte from Catharanthus roseus leaves in India. Whole genome sequencing was performed on the Illumina NovaSeq 6000 platform, generating 35.2 million paired-end raw reads (150 bp) and providing ∼120× coverage (∼5.32 Gb of raw data) for a 44.24 Mb assembly (960 contigs >1 kb, GC content of 47.76%, and an N50 of 101,126 bp). Read remapping showed 96.07% alignment to the assembly. BUSCO (fungi_odb10) analysis indicated 97.0% completeness. Gene prediction using AUGUSTUS identified 11,916 protein-coding genes. BLASTp searches against the Swiss-Prot database yielded significant hits for 7,299 proteins, of which 7,204 were mapped to Gene Ontology (GO) terms and 5,869 sequences received functional annotations. Integration of InterProScan-supported annotations resulted in 5,645 proteins assigned at least one GO term. KEGG KAAS assigned 4,144 genes (3,391 KO numbers) to diverse pathways. Carbohydrate-active enzyme (CAZyme) analysis revealed 556 CAZyme-encoding genes, with 39.74% (221 genes) predicted to be secreted. antiSMASH detected 111 biosynthetic gene clusters (BGCs), including polyketide synthases (PKS), non-ribosomal peptide synthetases (NRPS), terpenes and hybrid clusters. The complete genome sequence and raw reads have been deposited in the National Center for Biotechnology Information (NCBI) under the GenBank accession number JBSEFG000000000, BioProject number PRJNA1335662, BioSample ID SAMN52018968, and SRA (raw reads) accession number SRR35731853. The genome assembly (FASTA), gene annotation (GFF3), and secondary genome analysis files are deposited in Mendeley Data (Version 3; DOI: 10.17632/b8jn5rtwkg.3) under a CC BY 4.0 license to ensure transparency, reproducibility, and reuse of the analyses.
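
The headline figures in the abstract can be cross-checked with simple arithmetic. The Python sketch below is illustrative only and is not the authors' pipeline. It recomputes sequencing depth as raw bases divided by assembly length, and shows how an N50 value is defined using a made-up list of contig lengths (the real contig lengths are not given in this summary). It assumes that the 35.2 million figure counts total reads rather than read pairs. All function and variable names are hypothetical.

```python
"""Sanity checks on figures quoted in the abstract (illustrative only)."""


def coverage(total_bases: float, assembly_size: float) -> float:
    """Mean sequencing depth: sequenced bases divided by assembly length."""
    return total_bases / assembly_size


def n50(contig_lengths: list[int]) -> int:
    """Largest length L such that contigs of length >= L
    together hold at least half of the total assembly length."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("contig_lengths must not be empty")


# Values as reported in the abstract.
RAW_BASES = 5.32e9       # ~5.32 Gb of raw data
ASSEMBLY_SIZE = 44.24e6  # 44.24 Mb assembly
READS = 35.2e6           # assumption: total reads, not read pairs
READ_LENGTH = 150        # bp
CAZYMES = 556            # CAZyme-encoding genes
SECRETED = 221           # CAZymes predicted to be secreted

depth = coverage(RAW_BASES, ASSEMBLY_SIZE)
bases_from_reads = READS * READ_LENGTH
secreted_pct = 100 * SECRETED / CAZYMES

# Hypothetical contig lengths, used only to demonstrate the N50 definition.
toy_contigs = [100, 80, 60, 40, 20]
toy_n50 = n50(toy_contigs)

if __name__ == "__main__":
    print(f"depth: about {depth:.1f}x")
    print(f"bases from reads: about {bases_from_reads / 1e9:.2f} Gb")
    print(f"secreted CAZymes: {secreted_pct:.2f}%")
    print(f"toy N50: {toy_n50}")
```

Under these assumptions the depth comes out at roughly 120×, in line with the reported coverage, and 35.2 million reads of 150 bp give about 5.28 Gb, close to the reported ∼5.32 Gb. The ratio 221/556 evaluates to about 39.75%, so the reported 39.74% appears to be truncated rather than rounded.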

## Linked entities

- **Species:** Catharanthus roseus (taxon 4058), Xylaria sp. (taxon 1715255)

## Full-text entities

- **Diseases:** fungal (MESH:D009181)
- **Chemicals:** Carbohydrate (MESH:D002241), streptomycin sulfate (MESH:D013307), BUSCO (-), NaCl (MESH:D012965), Tween-20 (MESH:D011136), isocyanide (MESH:D003486), indole (MESH:C030374), ethanol (MESH:D000431), mercuric chloride (MESH:D008627), phenol (MESH:D019800), water (MESH:D014867), chloroform (MESH:D002725), terpene (MESH:D013729), agarose (MESH:D012685)
- **Species:** Catharanthus roseus (chatas, species) [taxon 4058], Homo sapiens (human, species) [taxon 9606], Histoplasma capsulatum (species) [taxon 5037], Xylaria sp. (species) [taxon 1715255]
- **Cell lines:** KR-3 U — Xenopus laevis (African clawed frog), Spontaneously immortalized cell line (CVCL_VQ55)

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12950487/full.md

## Figures

3 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12950487/full.md

## References

23 references — full list in the complete paper: https://tomesphere.com/paper/PMC12950487/full.md

---
Source: https://tomesphere.com/paper/PMC12950487