# AnnCovDB: a manually curated annotation database for mutations in SARS-CoV-2 spike protein

**Authors:** Xiaomin Zhang, Zhongyi Lei, Jiarong Zhang, Tingting Yang, Xian Liu, Jiguo Xue, Ming Ni

PMC · DOI: 10.1093/database/baaf002 · 2025-02-12

## TL;DR

AnnCovDB is a manually curated database that organizes functional annotations of mutations in the SARS-CoV-2 spike protein to help understand their effects on infection and transmission.

## Contribution

The paper contributes a structured, manually curated database of functional annotations for SARS-CoV-2 spike mutations, compiled from published literature.

## Key findings

- AnnCovDB includes 2093 manually curated functional annotations for 205 single and 93 multiple mutations in the S protein.
- The database organizes annotations into hierarchical categories for user-friendly exploration of mutation effects.
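
The hierarchical organization can be illustrated with a small sketch. It assumes each annotation entity is available as a single string whose levels are separated by `➔`, as in the example path quoted in the abstract. The helper names `split_path` and `build_tree` are hypothetical and are not part of AnnCovDB.

```python
from typing import Dict, List

# Separator taken from the example path in the abstract.
SEPARATOR = "➔"


def split_path(entity: str) -> List[str]:
    """Split one annotation entity into its ordered category levels."""
    return [part.strip() for part in entity.split(SEPARATOR) if part.strip()]


def build_tree(entities: List[str]) -> Dict[str, dict]:
    """Merge annotation paths into a nested dict that can be browsed level by level."""
    tree: Dict[str, dict] = {}
    for entity in entities:
        node = tree
        for level in split_path(entity):
            # Reuse the existing branch for this level, or create an empty one.
            node = node.setdefault(level, {})
    return tree


# The only path used here is the N501Y example given in the abstract.
example = "Infectious cycle➔Attachment➔ACE2 binding affinity➔Increase"
tree = build_tree([example])
```

A nested dictionary keeps shared prefixes (for instance, several entities under the same top-level category) in one branch, which is one simple way to support browsing by category.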

## Abstract

Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) has been circulating and adapting within the human population for >4 years. A large number of mutations have occurred in the viral genome, giving rise to significant variants known as variants of concern (VOCs) and variants of interest (VOIs). The spike (S) protein harbors many of the characteristic mutations of VOCs and VOIs, and significant efforts have been made to explore the functional effects of mutations in the S protein, which can cause or contribute to viral infection, transmission, immune evasion, pathogenicity, and illness severity. However, this knowledge is dispersed across many publications, and a well-structured, manually curated database for functional annotation is lacking. AnnCovDB is a database that provides manually curated functional annotations for mutations in the S protein of SARS-CoV-2. Mutations in the S protein carried by at least 8000 variants in GISAID were chosen and then used as query keywords to search the PubMed database. From the retrieved publications, 2093 annotation entities for 205 single mutations and 93 multiple mutations were manually curated. These entities were organized into multilevel hierarchical categories for user convenience. For example, one annotation entity for the N501Y mutation is ‘Infectious cycle➔Attachment➔ACE2 binding affinity➔Increase’. AnnCovDB can be used to query specific mutations and to browse functional annotation entities.
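
As a rough illustration of the selection step described above, the sketch below parses a single-substitution label such as N501Y and keeps the mutations that meet the 8000-variant threshold. It is not the authors' pipeline: the function names, the regular expression, and the input format (a mapping from mutation label to variant count) are assumptions, and it handles only single substitutions in the `N501Y` style.

```python
import re
from typing import Dict, List, NamedTuple


class Substitution(NamedTuple):
    ref: str       # reference amino acid, e.g. "N"
    position: int  # residue position in the S protein, e.g. 501
    alt: str       # substituted amino acid, e.g. "Y"


# Assumed label format: one letter, digits, one letter (e.g. "N501Y").
_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Parse a label like 'N501Y' into reference residue, position, and new residue."""
    match = _SUBSTITUTION.match(label.strip())
    if match is None:
        raise ValueError(f"not a single-substitution label: {label!r}")
    return Substitution(match.group(1), int(match.group(2)), match.group(3))


def select_mutations(counts: Dict[str, int], min_variants: int = 8000) -> List[str]:
    """Return labels whose count is at least min_variants, sorted alphabetically."""
    return sorted(label for label, n in counts.items() if n >= min_variants)
```

For example, `parse_substitution("N501Y")` yields `Substitution(ref="N", position=501, alt="Y")`. The selected labels could then serve as literature search keywords, as the abstract describes. Any counts passed to `select_mutations` would have to come from GISAID itself; none are given here.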

Database URL: https://AnnCovDB.app.bio-it.tech/

## Linked entities

- **Proteins:** CHMP5 (charged multivesicular body protein 5), ACE2 (angiotensin converting enzyme 2)
- **Diseases:** severe acute respiratory syndrome coronavirus 2 (MONDO:0100096), SARS-CoV-2 (MONDO:0100096)

## Full-text entities

- **Genes:** ACE2 (angiotensin converting enzyme 2) [NCBI Gene 59272] {aka ACEH}, S (surface glycoprotein) [NCBI Gene 43740568] {aka spike glycoprotein}
- **Diseases:** viral infection (MESH:D014777)
- **Species:** Severe acute respiratory syndrome coronavirus 2 (no rank) [taxon 2697049], Homo sapiens (human, species) [taxon 9606]
- **Mutations:** N501Y

## Figures

1 figure with captions in the complete paper: https://tomesphere.com/paper/PMC11817795/full.md

---
Source: https://tomesphere.com/paper/PMC11817795