# neomerDB: a comprehensive database of neomer biomarkers in cancer

**Authors:** Kimonas Provatas, Candace S Y Chan, Ioannis Kerasiotis, Eleftherios Bochalis, Akshatha Nayak, Brad E Zacharia, Georgios A Pavlopoulos, Wei Li, Ilias Georgakopoulos-Soares

PMC · DOI: 10.1093/database/baag006 · 2026-02-12

## TL;DR

neomerDB is a database that catalogues neomers, short DNA sequences that are absent from the human genome and emerge recurrently through somatic mutations in cancer, for use as biomarkers in cancer detection and monitoring.

## Contribution

The novel contribution is the creation of neomerDB, a comprehensive database of neomer biomarkers across cancer types and organs.

## Key findings

- Neomers were identified for each cancer type and organ from 10 000 whole exome sequencing and 2658 whole genome sequencing tumour-matched samples.
- A case study showed that neomers can detect glioblastoma in liquid biopsy samples (n = 38) using cell-free DNA and cell-free RNA, with a ROC AUC of 0.98 and a precision-recall score of 0.99.
- Nullomers and neomers that can arise from germline variants were removed using 76 215 whole genomes and 730 947 whole exomes of individuals from diverse ancestries.

## Abstract

The development of biomarkers for population screening, early cancer detection, monitoring, and recurrence surveillance offers substantial potential to improve patient outcomes and save lives. Nullomers are short k-mers that are absent from a human genome, and neomers are the subset of nullomers that emerge recurrently due to somatic mutations during cancer development. Here, we have developed neomerDB, a database that encompasses a catalogue of neomers across cancer types and organs. We examined 10 000 whole exome sequencing and 2658 whole genome sequencing tumour-matched samples and identified the set of neomers associated with each cancer type and organ. We also analysed 76 215 whole genomes and 730 947 whole exomes of individuals from diverse ancestries, from which we removed nullomers and neomers that can arise due to germline variants in the population. Finally, we conducted a case study demonstrating that neomers can be utilized to detect glioblastoma from liquid biopsy samples (n = 38), utilizing cell-free DNA and cell-free RNA, achieving a Receiver Operating Characteristic Area Under the Curve (ROC AUC) score of 0.98 and a precision-recall score of 0.99. neomerDB is a user-friendly database that enables advanced searches and provides interactive visualizations and download options for neomer biomarkers. neomerDB is publicly available at https://neomerDB.com/.
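
The definitions in the abstract can be illustrated with a minimal Python sketch on toy sequences. It is not the authors' pipeline: the function names (`kmers`, `nullomers`, `neomer_candidates`), the toy sequences, and the choice of k = 3 are all assumptions made for illustration. It also simplifies in two ways: it considers only the forward strand, and it ignores the recurrence requirement (a real neomer must emerge recurrently across tumour samples).

```python
from itertools import product


def kmers(seq: str, k: int) -> set:
    """Return the set of k-mers observed in seq (forward strand only)."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def nullomers(reference: str, k: int) -> set:
    """Return all k-mers over ACGT that never occur in the reference."""
    all_kmers = {"".join(p) for p in product("ACGT", repeat=k)}
    return all_kmers - kmers(reference, k)


def neomer_candidates(reference: str, tumour: str,
                      germline_haplotypes: list, k: int) -> set:
    """Nullomers that appear in the tumour but in no germline haplotype.

    Hypothetical helper: a single tumour sequence stands in for a cohort,
    so recurrence across patients is not modelled here.
    """
    emerged = kmers(tumour, k) & nullomers(reference, k)
    germline = set()
    for haplotype in germline_haplotypes:
        germline |= kmers(haplotype, k)
    return emerged - germline


# Toy data (assumed, not from the paper).
reference = "ACGTACGTAC"
tumour = "ACGTTCGTAC"                # one somatic substitution (A -> T)
germline_haplotypes = ["ACGTTAGTAC"]  # a population variant that also creates GTT

candidates = neomer_candidates(reference, tumour, germline_haplotypes, k=3)
print(sorted(candidates))
```

In this toy case the somatic substitution creates three k-mers absent from the reference (GTT, TTC, TCG). GTT is discarded because a germline haplotype also produces it, which mirrors the population-wide filtering step described above.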

## Linked entities

- **Diseases:** glioblastoma (MONDO:0018177), cancer (MONDO:0004992)

## Full-text entities

- **Diseases:** glioblastoma (MESH:D005909), cancer (MESH:D009369)
- **Species:** Homo sapiens (human, species) [taxon 9606]

## Figures

5 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12895195/full.md

---
Source: https://tomesphere.com/paper/PMC12895195