# Unlocking the secrets of SARS-CoV-2 nsp3 by combining experiments with AlphaFold2 domain prediction

**Authors:** Maximilian Edich, Yunyun Gao, David C Briggs, Andrea Thorn

PMC · DOI: 10.26508/lsa.202503247 · 2026-02-20

## TL;DR

Researchers combined AlphaFold2 domain predictions with experiments to better understand the structure and function of nsp3, the largest SARS-CoV-2 protein.

## Contribution

A previously unknown folded domain in SARS-CoV-2 nsp3 was predicted and experimentally confirmed.

## Key findings

- A novel folded domain in nsp3 was identified and validated experimentally.
- Domain Y1 may play a role in assembling viral proteins into a pore that exports RNA.
- The study proposes a revised domain segmentation and naming system for nsp3.

## Abstract

Combining structural bioinformatics, AI-based fold prediction, and traditional experiments sheds light on the largest SARS-CoV-2 protein.

Nonstructural protein 3 (nsp3) is crucial for SARS-CoV-2 infection. With roughly 2000 residues, it is the largest protein of the virus and a major drug target. However, because of its size, disordered regions, and transmembrane domains, the atomic structure of the whole protein has not yet been established. The structures of only 10 of its 16 domains have been determined individually by experiment. Here, we demonstrate how structural bioinformatics, AI-based fold prediction, and traditional experiments complement each other and can shed light on the makeup of this important protein, both in SARS-CoV-2 and in related viruses. Our method can be generalized to other multidomain proteins. Our prediction-based approach reveals a previously undescribed folded domain, which we could confirm experimentally. Our research also suggests a potential function for the domain Y1: it may be involved in the assembly of nsp3, nsp4, and nsp6 into the hexameric pore, which was discovered by electron tomography and exports RNA into the cytosol. The Y1 hexamer, however, could not be expressed on its own. We also revise the domain segmentation and nomenclature of nsp3.
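To illustrate what prediction-based domain segmentation can look like in practice, here is a minimal sketch that splits a sequence into candidate folded regions using per-residue AlphaFold2 confidence (pLDDT). It is not the authors' pipeline, which this summary does not describe. The function name `segment_by_confidence`, the cutoff of 70 (the conventional lower bound of AlphaFold's "confident" band), and the minimum length of 30 residues are all assumptions chosen for illustration, and the scores are toy values rather than data from the paper.

```python
from typing import List, Tuple


def segment_by_confidence(
    plddt: List[float],
    threshold: float = 70.0,  # assumed cutoff for a "confident" residue
    min_length: int = 30,     # assumed minimum size of a folded region
) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) ranges of consecutive residues
    whose score is >= threshold, keeping only runs of at least min_length."""
    segments: List[Tuple[int, int]] = []
    start = None  # first residue of the run currently being tracked
    for i, score in enumerate(plddt, start=1):
        if score >= threshold:
            if start is None:
                start = i
        else:
            # A low-confidence residue closes the current run, if any.
            if start is not None and (i - 1) - start + 1 >= min_length:
                segments.append((start, i - 1))
            start = None
    # Close a run that extends to the end of the sequence.
    if start is not None and len(plddt) - start + 1 >= min_length:
        segments.append((start, len(plddt)))
    return segments


# Toy profile: two long confident stretches, one short one, and
# low-confidence (possibly disordered) linkers in between.
toy_scores = [90.0] * 40 + [40.0] * 10 + [85.0] * 20 + [30.0] * 5 + [80.0] * 35
candidate_domains = segment_by_confidence(toy_scores)
print(candidate_domains)
```

Ranges found this way are only hypotheses about domain boundaries. As the abstract stresses, a predicted fold still needs experimental confirmation.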

## Linked entities

- **Proteins:** SH2D3C (SH2 domain containing 3C), PRSS57 (serine protease 57)

## Full-text entities

- **Genes:** PARP14 (poly(ADP-ribose) polymerase family member 14) [NCBI Gene 54625] {aka ARTD8, BAL2, PARP-14, pART8}, ITGAM (integrin subunit alpha M) [NCBI Gene 3684] {aka CD11B, CR3A, HNA-4, MAC-1, MAC1A, MO1A}, ORF1ab (ORF1a polyprotein;ORF1ab polyprotein) [NCBI Gene 43740578], SUMO1P1 (SUMO1 pseudogene 1) [NCBI Gene 391257] {aka PIC1L, UBL2, UBL6}, SUMO1 (small ubiquitin like modifier 1) [NCBI Gene 7341] {aka DAP1, GMP1, OFC10, PIC1, SMT3, SMT3C}, LGALS3 (galectin 3) [NCBI Gene 3958] {aka CBP35, GAL3, GALBP, GALIG, L31, LGALS2}, PPP1CA (protein phosphatase 1 catalytic subunit alpha) [NCBI Gene 5499] {aka PP-1A, PP1A, PP1alpha, PPP1A}, SH2D3A (SH2 domain containing 3A) [NCBI Gene 10045] {aka NSP1}, SPECC1 (sperm antigen with calponin homology and coiled-coil domains 1) [NCBI Gene 92521] {aka CYTSB, HCMOGT-1, HCMOGT1, NSP, NSP5}
- **Diseases:** COVID-19 (MESH:D000086382), infection (MESH:D007239), Coronavirus (MESH:D018352), viral diseases (MESH:D014777), betaSLD (MESH:D000080888)
- **Chemicals:** Hydrogen (MESH:D006859), lipid (MESH:D008055), amino acids (MESH:D000596), Cytiva (-), glycerol (MESH:D005990), disulfide (MESH:D004220), SDS (MESH:D012967), HCl (MESH:D006851), MES (MESH:C004550), water (MESH:D014867), EDTA (MESH:D004492), nitrogen (MESH:D009584), ethylene glycol (MESH:D019855), alanine (MESH:D000409), ADP (MESH:D000244), proline (MESH:D011392), NaCl (MESH:D012965), metal (MESH:D008670), zinc acetate (MESH:D019345), zinc (MESH:D015032), TCEP (MESH:C080938)
- **Species:** Coronaviridae (family) [taxon 11118], Sarbecovirus (subgenus) [taxon 2509511], Betacoronavirus (genus) [taxon 694002], Homo sapiens (human, species) [taxon 9606], Alphacoronavirus (genus) [taxon 693996], Severe acute respiratory syndrome-related coronavirus (no rank) [taxon 694009], Canada goose coronavirus (species) [taxon 2569586], Murine hepatitis virus (no rank) [taxon 11138], Severe acute respiratory syndrome coronavirus 2 (no rank) [taxon 2697049], MHV [taxon 2845560], Gammacoronavirus (genus) [taxon 694013], Deltacoronavirus (genus) [taxon 1159901]
- **Mutations:** C in 0, cysteine-histidine, C111S
- **Cell lines:** coli — Mus musculus (Mouse), Hybridoma (CVCL_C5CN), BL21Gold — Homo sapiens (Human), EBV-related Burkitt lymphoma, Cancer cell line (CVCL_M639)

## Figures

12 figures with captions in the complete paper: https://tomesphere.com/paper/PMC12923375/full.md

---
Source: https://tomesphere.com/paper/PMC12923375