Nextclade data set for the ORF5-based lineage classification of PRRSV-1
Michael Zeller, Jennifer Chang, Giovani Trevisan, Phillip C. Gauger, Jianqiang Zhang

TL;DR
This paper introduces a standardized data set for classifying PRRSV-1 using ORF5, enabling faster and more consistent analysis.
Contribution
The novel contribution is a Nextclade data set for PRRSV-1 lineage classification built on a global ORF5-based nomenclature.
Findings
The data set allows rapid sequence analysis, visualization, and comparison with reference strains and vaccines.
It promotes broader adoption of standardized classification for PRRSV-1 research and surveillance.
Abstract
A Nextclade data set for PRRSV-1 ORF5, based on a global nomenclature for standardized lineage classification, was developed. This tool enables rapid sequence analysis, visualization, and comparison with reference strains and vaccines. By making the classification accessible through a web-based tool, it facilitates broader adoption of PRRSV-1 classification frameworks for research and surveillance.
Taxonomy
Topics: Animal Virus Infections Studies · Viral gastroenteritis research and epidemiology · Viral Infections and Immunology Research
ANNOUNCEMENT
Porcine reproductive and respiratory syndrome virus (PRRSV) is an economically significant swine pathogen, first identified as the causative agent of Blue Ear disease in 1991–1992 (1, 2). PRRSV is divided into two species, Betaarterivirus europensis (PRRSV-1) and Betaarterivirus americense (PRRSV-2), with the Lelystad strain (GenBank M96262) serving as the prototype for PRRSV-1 and the VR-2332 strain (GenBank U87392) as the prototype for PRRSV-2. Genetic characterization of PRRSV is primarily focused on open reading frame 5 (ORF5), selected due to its high genetic diversity and the abundance of available sequences worldwide (3, 4). While multiple, progressive nomenclatures exist for PRRSV-2 ORF5 (3, 5–8), PRRSV-1 ORF5 classifications have been more limited and often regionally based (3, 9, 10). Recently, a global sequence-based nomenclature was proposed for PRRSV-1, unifying the previous classification schemes (11). To facilitate the adoption of this global nomenclature and improve usability, we have developed a Nextclade data set for PRRSV-1 ORF5 sequence classification. Nextclade is a web-based tool for rapid lineage assignment and has previously been used to establish an ORF5 data set for PRRSV-2 (12, 13).
All sequences and associated metadata were obtained from the supplemental file provided by Yim-im et al. (11). Metadata included the GenBank accession number, assigned lineage, year of collection, and country. For samples missing the year of collection, the year of GenBank submission was used instead. The Lelystad (M96262) strain was selected as the primary reference sequence due to its historical significance. Additional metadata on nucleotide and amino acid mutations relative to the reference were generated using the augur ancestral and augur translate subcommands. The data set’s phylogenetic tree was midpoint rooted, and colors were assigned for the country, year, and lineage metadata. Five PRRSV-1 vaccine sequences were included in the data set: Porcilis, Pyrsvac, Unistrain/Amervac, PRRSVFLEX EU, and Suvaxyn, each vaccine annotated as an “X” on the tree. Alignment parameters were adjusted to allow a minimum sequence length of 400 nucleotides (~65% of the sequence) and a minSeedCoverage of 0.01.
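The two data-preparation rules described above (falling back to the GenBank submission year when the collection year is missing, and requiring at least 400 nucleotides) can be illustrated with a minimal Python sketch. This is not the authors' pipeline or Nextclade's internal logic: the function names, record fields, and example records are hypothetical, and the length check is only a simple pre-screen that counts non-gap characters.

```python
from typing import Optional

# Minimum sequence length stated in the text (nucleotides).
MIN_LENGTH = 400


def resolve_year(collection_year: Optional[int],
                 submission_year: Optional[int]) -> Optional[int]:
    """Prefer the collection year; fall back to the GenBank submission year."""
    if collection_year is not None:
        return collection_year
    return submission_year


def passes_min_length(sequence: str, min_length: int = MIN_LENGTH) -> bool:
    """Return True when the sequence has at least min_length non-gap characters."""
    ungapped = sequence.replace("-", "")
    return len(ungapped) >= min_length


# Hypothetical records for illustration only (not taken from the data set).
records = [
    {"name": "example_A", "collection_year": 2015, "submission_year": 2017},
    {"name": "example_B", "collection_year": None, "submission_year": 2019},
]

years = {
    r["name"]: resolve_year(r["collection_year"], r["submission_year"])
    for r in records
}
print(years)
```

In this sketch, example_B has no collection year, so its submission year is used instead, mirroring the rule described in the text.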
The final data set consisted of 967 PRRSV-1 ORF5 sequences from 23 countries. Nextclade standardizes the classification workflow by combining the reference set from Yim-im et al. (11) with its built-in methodology. To use this tool, users simply upload their sequences (Fig. 1). Upon submission, Nextclade generates a table displaying the inferred lineage, sequence quality metrics, and mutations relative to the reference. A secondary screen visualizes the placement of user-submitted sequences on a neighbor-joining tree, allowing for comparison with other sequences, including vaccines. All results are available for direct download to facilitate storage and further analysis. Nextclade provides a user-friendly platform for accurate PRRSV lineage assignment, supporting the adoption of the new PRRSV nomenclature.
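Because the results can be downloaded, downstream summaries are straightforward to script. The following Python sketch tallies sequences per assigned lineage from a tab-separated results table. The column names (`seqName`, `clade`, `qc.overallStatus`), the lineage labels, and the sample rows are assumptions made for illustration; check the header of the actual downloaded file before relying on them.

```python
import csv
import io
from collections import Counter

# Hypothetical excerpt of a downloaded results table.
# Column names and lineage labels are assumptions, not taken from the paper.
SAMPLE_TSV = (
    "seqName\tclade\tqc.overallStatus\n"
    "sample_1\tlineage_X\tgood\n"
    "sample_2\tlineage_X\tmediocre\n"
    "sample_3\tlineage_Y\tgood\n"
)


def count_lineages(tsv_text: str, lineage_column: str = "clade") -> Counter:
    """Count how many sequences were assigned to each lineage."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    # Skip rows where the lineage column is missing or empty.
    return Counter(
        row[lineage_column] for row in reader if row.get(lineage_column)
    )


counts = count_lineages(SAMPLE_TSV)
for lineage, n in counts.most_common():
    print(f"{lineage}\t{n}")
```

To use this on a real file, read its contents with `open(path, encoding="utf-8").read()` and pass the text to `count_lineages`, adjusting `lineage_column` to match the file's header.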
Fig 1. The complete tree from the PRRSV-1 Nextclade data set. The tree is colored by lineage, with additional labels added for clarity.
- 1. Wensvoort G, Terpstra C, Pol JM, ter Laak EA, Bloemraad M, de Kluyver EP, Kragten C, van Buiten L, den Besten A, Wagenaar F. 1991. Mystery swine disease in The Netherlands: the isolation of Lelystad virus. Vet Q 13:121–130. doi:10.1080/01652176.1991.9694296. PMID: 1835211.
- 2. Collins JE, Benfield DA, Christianson WT, Harris L, Hennings JC, Shaw DP, Goyal SM, Mc Cullough S, Morrison RB, Joo HS. 1992. Isolation of swine infertility and respiratory syndrome virus (isolate ATCC VR-2332) in North America and experimental reproduction of the disease in gnotobiotic pigs. J Vet Diagn Invest 4:117–126. doi:10.1177/104063879200400201. PMID: 1616975.
- 3. Shi M, Lam TT-Y, Hon C-C, Hui RK-H, Faaberg KS, Wennblom T, Murtaugh MP, Stadejek T, Leung FC-C. 2010. Molecular epidemiology of PRRSV: a phylogenetic perspective. Virus Res 154:7–17. doi:10.1016/j.virusres.2010.08.014. PMID: 20837072.
- 4. Wesley RD, Mengeling WL, Lager KM, Clouser DF, Landgraf JG, Frey ML. 1998. Differentiation of a porcine reproductive and respiratory syndrome virus vaccine strain from North American field strains by restriction fragment length polymorphism analysis of ORF 5. J Vet Diagn Invest 10:140–144. doi:10.1177/104063879801000204. PMID: 9576340.
- 5. Yim-Im W, Anderson TK, Paploski IAD, Vander Waal K, Gauger P, Krueger K, Shi M, Main R, Zhang J. 2023. Refining PRRSV-2 genetic classification based on global ORF 5 sequences and investigation of their geographic distributions and temporal changes. Microbiol Spectr 11:e0291623. doi:10.1128/spectrum.02916-23. PMID: 37933982. PMCID: PMC10848785.
- 6. Paploski IAD, Pamornchainavakul N, Makau DN, Rovira A, Corzo CA, Schroeder DC, Cheeran MC-J, Doeschl-Wilson A, Kao RR, Lycett S, Vander Waal K. 2021. Phylogenetic structure and sequential dominance of sub-lineages of PRRSV type-2 lineage 1 in the United States. Vaccines (Basel) 9:608. doi:10.3390/vaccines9060608. PMID: 34198904. PMCID: PMC8229766.
- 7. Paploski IAD, Corzo C, Rovira A, Murtaugh MP, Sanhueza JM, Vilalta C, Schroeder DC, Vander Waal K. 2019. Temporal dynamics of co-circulating lineages of porcine reproductive and respiratory syndrome virus. Front Microbiol 10:2486. doi:10.3389/fmicb.2019.02486. PMID: 31736919. PMCID: PMC6839445.
- 8. Vander Waal K, Pamornchainavakul N, Kikuti M, Zhang J, Zeller M, Trevisan G, Rossow S, Schwartz M, Linhares DCL, Holtkamp DJ, da Silva JPH, Corzo CA, Baker JP, Anderson TK, Makau DN, Paploski IAD. 2025. PRRSV-2 variant classification: a dynamic nomenclature for enhanced monitoring and surveillance. mSphere 10:e0070924. doi:10.1128/msphere.00709-24. PMID: 39846734. PMCID: PMC11852939.
