# DDBJ update in 2025: system integration for global data-sharing including pathogen surveillance

**Authors:** Takeshi Ara, Yuichi Kodama, Takatomo Fujisawa, Takehide Kosuge, Kyungbum Lee, Jun Mashima, Osamu Ogasawara, Yasuhiro Tanizawa, Tomoya Tanjo, Yasukazu Nakamura, Masanori Arita

PMC · DOI: 10.1093/nar/gkaf1273 · Nucleic Acids Research · 2025-11-24

## TL;DR

The DDBJ updated its infrastructure and collaborations in 2024 to improve global biological data sharing and pathogen surveillance.

## Contribution

The paper reports system integration and infrastructure upgrades at DDBJ that support global data-sharing and pathogen surveillance.

## Key findings

- Mandatory metadata standards improved data quality and transparency.
- Collaborations with Korea and China enhanced regional data resilience.
- New high-performance computing infrastructure supports AI-driven analyses.

## Abstract

The Bioinformation and DNA Data Bank of Japan Center (https://www.ddbj.nig.ac.jp/) continues to serve as a global core infrastructure for biological information as part of the International Nucleotide Sequence Database Collaboration. In 2024, we reinforced data quality and transparency through mandatory metadata standards, including sampling geolocation and date, aligning with international debates on Digital Sequence Information. Our repositories expanded across multiple omics layers, and our secure environment for analysis of personal genome provides tools and precomputed data on personal genomes archived at the Japanese Genotype-phenotype Archive. International collaboration was advanced through metadata harmonization with the Korea Bioinformation Center and the China National Genomics Data Center, which strengthened regional data resilience and integration. Inside Japan, we began a new collaboration with the Japan Institute of Health Security, which facilitated the systematic release of pathogen genomes via the Pathogens.jp portal. To support these expanding activities, our high-performance computing infrastructure was renewed with 14 000 CPU cores, 50 PB Lustre storage, and newly deployed GPU nodes. These updates enable both AI-driven analyses and cost-efficient large-scale genome reanalysis.
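As an illustration of the mandatory sampling geolocation and date metadata mentioned in the abstract, the following is a minimal sketch of a pre-submission check. It is not DDBJ's validator. The field names `geo_loc_name` and `collection_date` are assumed here, modeled on INSDC-style sample attributes, and the accepted date layouts (year, year-month, full date) are a simplification; the actual submission rules may allow other values.

```python
from datetime import datetime

# Assumed field names (modeled on INSDC-style attributes); the real
# submission rules are richer than this sketch.
REQUIRED_FIELDS = ("geo_loc_name", "collection_date")

# Date layouts accepted by this sketch: year, year-month, or full date.
DATE_FORMATS = ("%Y", "%Y-%m", "%Y-%m-%d")


def is_valid_date(value: str) -> bool:
    """Return True if value parses under one of the assumed date layouts."""
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(value, fmt)
            return True
        except ValueError:
            continue
    return False


def check_metadata(record: dict) -> list:
    """Return a list of problems found in one sample record (empty if none)."""
    problems = []
    for field in REQUIRED_FIELDS:
        value = record.get(field) or ""
        if not str(value).strip():
            problems.append(f"missing {field}")
    date = str(record.get("collection_date") or "").strip()
    if date and not is_valid_date(date):
        problems.append("collection_date is not YYYY, YYYY-MM, or YYYY-MM-DD")
    return problems
```

A record such as `{"geo_loc_name": "Japan: Shizuoka", "collection_date": "2024-05-17"}` passes this check, while a record lacking either field is reported with one message per missing field.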

## Full-text entities

- **Diseases:** infected (MESH:D007239), COVID-19 (MESH:D000086382), Infectious Disease (MESH:D003141), DSI (MESH:C564078)
- **Species:** Homo sapiens (human, species) [taxon 9606], Severe acute respiratory syndrome coronavirus 2 (no rank) [taxon 2697049]
- **Mutations:** L40S

## Full text

_Full body text omitted from this summary view._ Fetch the complete paper as Markdown: https://tomesphere.com/paper/PMC12807687/full.md

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC12807687/full.md

## References

16 references — full list in the complete paper: https://tomesphere.com/paper/PMC12807687/full.md

---
Source: https://tomesphere.com/paper/PMC12807687