Characterization of a Novel, Highly Divergent Paramyxovirus Discovered in a Bearded Seal of Subarctic Canada
Vadym Zaluzhnyi, Joost T. P. Verhoeven, Garry B. Stenson, Andrew S. Lang, Suzanne C. Dufour, Marta Canuti

TL;DR
A new paramyxovirus was discovered in a bearded seal from Canada, potentially representing a new subfamily of viruses.
Contribution
The discovery of a novel paramyxovirus with a unique genome and phylogenetic placement in a new subfamily.
Findings
BSAPV-1 has a complete coding genome of 15,898 nucleotides and encodes five core paramyxoviral proteins.
Phylogenetic analysis suggests BSAPV-1 is the first member of a novel paramyxoviral subfamily.
The virus was found in a single bearded seal, but its host association remains unclear.
Abstract
Seals are keystone animals in the Arctic and a valuable resource for Indigenous communities, but their virome is poorly understood. Through a preliminary investigation of the virome of seven North Atlantic bearded seals (Erignathus barbatus) from northwest Newfoundland, Canada, we discovered a new member of the Paramyxoviridae, a family including important animal pathogens. The complete coding genome sequence (15,898 nt) of the novel paramyxovirus, which we named bearded seal-associated paramyxovirus 1 (BSAPV-1), encoded five core paramyxoviral proteins—nucleoprotein, matrix, fusion, hemagglutinin-neuraminidase, and polymerase—and three proteins with no identifiable homologues that may represent the phosphoprotein, a small hydrophobic protein, and a transmembrane protein. Phylogenetic analysis, including BSAPV-1 and all 153 currently known paramyxoviral species, positioned the novel…
Figure 1
Figure 2
Funding
- Natural Sciences and Engineering Research Council of Canada (NSERC)
- Canada First Research Excellence Fund
- Memorial University School of Graduate Studies
Taxonomy
Topics: Virology and Viral Diseases · Marine animal studies overview · Ichthyology and Marine Biology
1. Introduction
Seals are ecologically and culturally relevant marine mammals. They engage in complex ecological relationships with marine and terrestrial organisms, and their meat and pelts are valuable resources for Indigenous communities [1]. With a lifespan exceeding 25 years, seals can serve as reservoirs for viruses in nature and may act as a bridge connecting land and sea. Many seal species migrate over long distances, potentially spreading viruses along their routes, with some gathering in large groups during migration, which promotes virus transmission within their populations [2,3,4]. Bearded seals (Erignathus barbatus), listed in 2024 as near threatened by the IUCN, are distributed across Arctic and subarctic regions and are the largest northern phocid [5]. They are preyed upon by polar bears, killer whales, sharks, and walruses and mostly feed on benthic prey such as clams, crustaceans, and demersal fish [5]. Despite their ecological significance, our knowledge about viruses affecting these animals is extremely limited.
Within a recent metagenomic exploration of the virome of bearded seals sampled along the shores of northwest Newfoundland, Canada, we identified a novel, highly divergent paramyxovirus. The family Paramyxoviridae comprises pleomorphic, enveloped viruses with non-segmented, negative-stranded RNA genomes, typically ranging in length from 14 to 20 knt, that infect vertebrates [6]. Below the lipid membrane, from which the fusion (F) and hemagglutinin-neuraminidase (HN) proteins protrude, is a layer of matrix protein (M) and the ribonucleoprotein complex, formed by the nucleocapsid (viral RNA surrounded by nucleoproteins (N)), together with the polymerase-associated protein or phosphoprotein (P) and the RNA-directed RNA polymerase or large (L) protein. The genomes of viruses from almost all genera include, in this order, open reading frames (ORFs) coding for N, P, M, F, HN, and L. Some viruses also encode additional putative proteins, such as a non-structural protein (C, within the P ORF and produced after leaky scanning), a cysteine-rich protein (V, within the P ORF and produced after mRNA editing), a small hydrophobic integral membrane protein (SH), and transmembrane proteins (tM). Finally, the genomes of some viruses contain highly conserved short intergenic sequences [6].
There are currently 153 recognized paramyxoviral species, divided among 23 genera and 9 subfamilies [7], infecting mammals, birds, fish, and reptiles. However, new viruses are continuously being discovered through metagenomic studies. A few paramyxoviruses from various genera and subfamilies have been described in marine animals (e.g., in Pacific and Atlantic salmon, spadenose shark, triplecross lizardfish, lined seahorse [7]). However, marine mammals are particularly impacted by members of the genus Morbillivirus, such as canine distemper virus (CDV), phocine distemper virus (PDV), and cetacean morbillivirus (CeMV), which may cause severe diseases characterized by fever, pneumonia, respiratory inflammation, seizures, tremors, discoordination, and immune suppression [8], and are associated with mass mortality events [9,10,11]. Additionally, a novel paramyxovirus belonging to the genus Jeilongvirus was recently detected in Antarctic seals near Brazil [12]. To the best of our knowledge, PDV, CDV, and the novel jeilongvirus are currently the only paramyxoviruses identified in seals. In this study, we report the detection of a novel virus, which we named bearded seal-associated paramyxovirus 1 (BSAPV-1), and describe its genome organization and phylogenetic relationships with other members of the family Paramyxoviridae.
2. Materials and Methods
This study included samples collected from 7 bearded seals (Erignathus barbatus) harvested by licensed hunters in northwest Newfoundland during the spring and summer of 2020 as part of a long-term biological sampling program at the Department of Fisheries and Oceans Canada. From each animal, the trachea and colon were excised and shipped to Memorial University, where their contents were sampled using polyester swabs (Starplex Scientific, Etobicoke, ON, Canada) and submerged together in 3 mL of universal virus transport media (Starswab Multitrans System, Starplex Scientific, Etobicoke, ON, Canada).
2.1. Metagenomics
From each sample, 220 μL was centrifuged at 10,000× g for 10 min, and 190 μL of the supernatant was incubated at 37 °C for 30 min along with 10 μL of DNase I (2 U/μL, New England Biolabs, Ipswich, MA, USA) and 22 μL of 1× DNase reaction buffer. Subsequently, 2.2 μL of 0.5 M ethylenediaminetetraacetic acid (EDTA) was added, and the samples were incubated for 10 min at 75 °C. Nucleic acid extraction was performed using the DNeasy Blood and Tissue Kit (Qiagen, Hilden, Germany) according to the manufacturer’s directions for liquid samples, with elution buffer volume reduced to 40 μL during the final step. Subsequently, 15 μL of the isolated nucleic acids (NA) was used for reverse transcription with the ProtoScript® II First Strand cDNA Synthesis Kit (New England Biolabs) with a modified protocol: NA combined with 3 μL of primer mix (random hexamers and oligo dTs) were preheated to 65 °C for 5 min and cooled on ice; a mix containing 20 μL of reaction mix and 3 μL of enzyme mix was added, and the mixture was incubated at 25 °C for 5 min, 42 °C for 60 min, and 80 °C for 5 min. Finally, a second strand synthesis was performed with the NEBNext® Ultra II Non-Directional RNA Second Strand Synthesis Module (New England Biolabs) by adding a mixture containing 10 μL of molecular grade water, 3 μL of enzyme mix, and 6 μL of 10× buffer, followed by incubation at 16 °C for 2.5 h. Samples were purified using Ampure beads (Beckman Coulter, Brea, CA, USA) at a 1:1 v:v ratio, and purified DNA was eluted in 15 μL of molecular-grade water. To increase DNA concentration, multiple displacement amplification was performed with Phi29 DNA polymerase (10 U/μL, Thermo Fisher Scientific, Waltham, MA, USA). After denaturing at 95 °C for 5 min, 8.6 μL of purified material was mixed with 2 μL Exo-Resistant Random Primer (Thermo Fisher Scientific). After immediate chilling on ice, a mixture of 2 μL buffer, 1 μL enzyme, 5 μL dNTPs (10 mM each), 0.2 μL DTT, and 1.2 μL DMSO was added to each sample, followed by an incubation at 30 °C for 4 h and one at 65 °C for 10 min. Samples were then individually subjected to a final purification step with Ampure beads at a 1:1 v:v ratio. To minimize sequencing costs, samples were equimolarly pooled, and library preparation through tagmentation and Illumina sequencing were outsourced to the Integrated Microbiome Resource, Dalhousie University, Halifax, NS, Canada.
The obtained viral reads were processed using an in-house developed Snakemake workflow (V2D; source available at https://github.com/jtpverhoeven/v2d, accessed on 25 January 2026), which utilizes a collection of third-party tools to automate the characterization of viromes. In short, reads from each sample (assigned to read groups) are quality filtered using fastp (version 0.23.4) [13]. To remove host contamination, reads are subsequently mapped with Bowtie 2 (version 2.5.4) [14] to a reference genome of choice (in this study Phoca vitulina, GenBank accession number GCF_004348235.1), and read pairs with concordant mapping to the reference are discarded. Co-assembly is then performed by piling up all reads from samples on a read-group basis, followed by contig construction using megahit (version 1.2.9), after which the resulting contigs are screened for complete viruses using CheckV (version 1.0.3) and hallmark viral features identified with geNomad (version 1.11.1) [15]. Additionally, deep trawling to detect incomplete or highly divergent viruses is performed, in which host-depleted reads are assembled on a per-sample basis, after which contigs with high identity to non-viral reference sequences are removed using kraken (version 2.15) and NCBI blast (version 2.16.0, task megablast) [16] in conjunction with the prebuilt “Standard plus Refseq protozoa, fungi & plant” index (available at https://benlangmead.github.io/aws-indexes/k2, accessed on 25 January 2026) and the NCBI core-nt database, respectively. Remaining contigs are then scanned with progressively wider settings using DIAMOND (version 2.1.11 [17], using the NCBI nr database), blastn (using the NCBI nt database), and tblastx (using the NCBI nt database). The workflow then produces an overview of contigs presumed to be of viral origin, and contigs with identity to viral reference sequences are subsequently manually confirmed.
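The host-depletion step described above can be illustrated with a minimal Python sketch. It assumes the Bowtie 2 alignments are available as SAM text and relies on the SAM FLAG bit 0x2 ("properly paired"), which Bowtie 2 sets for concordantly aligned pairs. The function name and the toy records are hypothetical, and this is not the V2D implementation.

```python
# Sketch of host depletion: drop read pairs that aligned concordantly to the host.
# Assumption: input is SAM text; FLAG bit 0x2 marks a properly paired (concordant) alignment.
PROPER_PAIR = 0x2


def non_host_read_names(sam_lines):
    """Return the sorted names of read pairs without a concordant host alignment."""
    seen = set()
    host = set()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):  # skip blank and header lines
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        seen.add(name)
        if flag & PROPER_PAIR:
            host.add(name)
    return sorted(seen - host)


# Toy records (hypothetical): pair_a maps concordantly, pair_b is unaligned.
toy_sam = [
    "@HD\tVN:1.6",
    "pair_a\t99\tchr1\t100\t42\t50M\t=\t300\t250\t*\t*",
    "pair_a\t147\tchr1\t300\t42\t50M\t=\t100\t-250\t*\t*",
    "pair_b\t77\t*\t0\t0\t*\t*\t0\t0\t*\t*",
    "pair_b\t141\t*\t0\t0\t*\t*\t0\t0\t*\t*",
]
print(non_host_read_names(toy_sam))
```

Only the names of retained pairs are returned here; a real workflow would write the corresponding reads back out for assembly.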
2.2. Virus Screening
To identify samples containing BSAPV-1, a set of primers targeting a 446 nt region of the M gene was created (Paramyxo_F1, GTCATGAGCGCGACATTCAC and Paramyxo_R1, TAGGGTCTCATCGGTTGGAG). Samples that did not test positive were subjected to hemi-nested PCR (Paramyxo_F1 and Paramyxo_R2, AGAGTCCAGAAACGGACTCC, 415 nt). This specific region was also chosen to resolve some ambiguities identified in the consensus sequence after Illumina sequencing. To conduct the PCRs, 12.5 µL of DreamTaq PCR Master Mix (Thermo Fisher Scientific, Waltham, MA, USA), 0.5 µL of each primer (10 μM), 11.5 µL of molecular grade water, and 1 µL of enriched dsDNA (2.5 µL of the first round PCR product for the hemi-nested PCR) were mixed and incubated for 3 min at 95 °C, followed by 40 (first PCR) or 25 (hemi-nested PCR) cycles of 30 s at 95 °C, 30 s at 50 °C, and 35 s at 72 °C, and a final 4 min at 72 °C. Obtained amplicons were purified with AMPure XP beads and outsourced for Sanger sequencing.
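As a quick sanity check on the primers quoted above, the following Python sketch reports their length and GC fraction and shows how a reverse primer can be reverse-complemented before searching for its binding site on the genome strand. The helper names are illustrative and not part of the study's methods.

```python
# Basic properties of the screening primers quoted in the text.
PRIMERS = {
    "Paramyxo_F1": "GTCATGAGCGCGACATTCAC",
    "Paramyxo_R1": "TAGGGTCTCATCGGTTGGAG",
    "Paramyxo_R2": "AGAGTCCAGAAACGGACTCC",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


for name, seq in PRIMERS.items():
    print(name, len(seq), f"GC={gc_fraction(seq):.2f}", reverse_complement(seq))
```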
2.3. Sequence and Phylogenetic Analyses
To verify sequencing coverage and quality, Illumina reads were mapped to the BSAPV-1 genomic sequence identified through V2D using Geneious Prime 2025.1.2 (Dotmatics, Boston, MA, USA), and a new contig was generated including ambiguities at polymorphic sites. Geneious was also used for ORF annotations, protein sequence predictions, and motif detection. Protein identification was performed using BLASTp with wide settings (word size: 2, gap cost: 9, gap extension: 1) and InterProScan [18]. To establish the phylogenetic relationships between BSAPV-1 and all currently known Paramyxoviridae members, protein sequences encoded by 6 core genes (N, P, M, F, HN, L) of 153 classified Paramyxoviridae species (Supplementary Table S1) and their homologs identified in BSAPV-1 were compared. Sequences of each protein were aligned separately with MAFFT (E-INS-i algorithm) [19], and the alignments were trimmed with trimAl (version 1.5) [20] and then concatenated into a single alignment. The IQ-TREE 3 ModelFinder function [21] was used on each separate alignment to identify the best model for distance estimation. Models with the lowest Bayesian information criterion (BIC) were identified as the best-fitting ones (Supplementary Table S2). Alignments were then used to generate a maximum-likelihood phylogenetic tree in IQ-TREE 3 [22] using a partition model [23], with ultrafast bootstrap approximation (UFBoot) [24] and SH-aLRT [25] used to assess branch robustness.
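The concatenation step can be sketched in Python as follows. This is a minimal illustration, not the authors' procedure: it assumes each per-protein alignment is held as a dictionary of equal-length aligned sequences, pads taxa lacking a protein with gaps, and records 1-based partition coordinates of the kind a partitioned analysis needs. The toy taxa and sequences are hypothetical.

```python
# Sketch: concatenate per-protein alignments and record partition coordinates.
def concatenate(alignments, order):
    """alignments: {protein: {taxon: aligned_seq}}; returns (supermatrix, partitions)."""
    taxa = sorted({t for aln in alignments.values() for t in aln})
    supermatrix = {t: "" for t in taxa}
    partitions = []
    start = 1  # 1-based column where the next partition begins
    for protein in order:
        aln = alignments[protein]
        lengths = {len(s) for s in aln.values()}
        if len(lengths) != 1:
            raise ValueError(f"{protein}: aligned sequences differ in length")
        width = lengths.pop()
        for t in taxa:
            # Taxa lacking this protein are padded with gap characters.
            supermatrix[t] += aln.get(t, "-" * width)
        partitions.append((protein, start, start + width - 1))
        start += width
    return supermatrix, partitions


# Toy input (hypothetical sequences, far shorter than real protein alignments).
toy = {
    "N": {"virus_A": "MSK-L", "virus_B": "MSRAL"},
    "L": {"virus_A": "MDQ", "virus_B": "MEQ", "virus_C": "M-Q"},
}
matrix, parts = concatenate(toy, ["N", "L"])
for protein, first, last in parts:
    print(f"{protein} = {first}-{last}")
```

Keeping the coordinates alongside the supermatrix is what allows a separate substitution model to be assigned to each protein block.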
3. Results
Using de novo assembly within the framework of V2D, the complete coding genome sequence (15,898 nt) of BSAPV-1 was reconstructed from 11,687 reads (average sequencing depth 78.5×) obtained from metagenomic sequencing of a pool of NA originating from 7 bearded seals. After PCR screening of individual dsDNA samples from the pool, only sample SL-13 was found to be positive. As the virus was identified in a pooled oral/fecal sample, it was not possible to determine with certainty whether it was a seal-infecting virus or originated from the seal’s diet.
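The relationship between the reported read count, genome length, and average depth can be checked with a short Python sketch. It assumes the usual approximation that mean depth equals total mapped bases divided by genome length; the implied mean read length it prints is a back-of-envelope inference, not a figure reported by the authors.

```python
# Figures quoted in the text.
GENOME_LENGTH_NT = 15_898  # complete coding genome length
MAPPED_READS = 11_687      # reads used for the reconstruction
MEAN_DEPTH = 78.5          # reported average sequencing depth (x)


def mean_depth(n_reads, mean_read_length, genome_length):
    """Approximate mean depth as total mapped bases divided by genome length."""
    return n_reads * mean_read_length / genome_length


# Rearranged: the mean mapped read length implied by the reported figures.
implied_read_length = MEAN_DEPTH * GENOME_LENGTH_NT / MAPPED_READS
print(f"implied mean read length: {implied_read_length:.1f} nt")
```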
3.1. Genome Characterization of BSAPV-1
The genomic sequence (Figure 1), with a GC content of 51.7%, contained ORFs for N (1221 nt), M (999 nt), F (1671 nt), HN (1944 nt), and L (6360 nt) proteins, arranged according to the pattern typical for other members of the Paramyxoviridae. The genome also included three ORFs whose predicted protein sequences shared no detectable homology with known proteins in GenBank. Specifically, a 1260 nt ORF was located between the N and M ORFs, where the P/V/C or V/P gene (depending on the genus) is usually located in viruses within Paramyxoviridae.
Another putative ORF was located at positions 4152–4847, encoding a putative 232 aa-long protein with 53.9% hydrophobic residues and unknown function. Finally, an ORF coding for a putative protein with predicted transmembrane and cytoplasmic domains was located upstream of the F gene. These two ORFs may encode SH and tM proteins, as in viruses of the genus Jeilongvirus, in whose genomes, however, these two ORFs are located downstream of F. Sequence analysis revealed the nucleotide motif AAAACTTAAG, which is repeated in many intergenic regions (Figure 1). The motif is not, however, fully conserved, as its sequence varies before and after the putative SH ORF.
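The ORF sizes quoted in this section can be cross-checked with simple arithmetic, sketched below in Python. The dictionary labels are illustrative. Whether the quoted ORF lengths include the stop codon is not stated in the text, so the sketch reports codon counts only and makes no claim about exact protein lengths.

```python
# ORF lengths (nt) quoted in the text; "putative P" is the 1260 nt ORF between N and M.
ORF_LENGTHS_NT = {
    "N": 1221,
    "putative P": 1260,
    "M": 999,
    "F": 1671,
    "HN": 1944,
    "L": 6360,
}


def span_length(start, end):
    """Length of a feature given 1-based inclusive coordinates."""
    return end - start + 1


def codon_count(length_nt):
    """Number of codons in an ORF; an ORF length must be a multiple of three."""
    if length_nt % 3 != 0:
        raise ValueError(f"{length_nt} nt is not a multiple of 3")
    return length_nt // 3


putative_sh_nt = span_length(4152, 4847)
print("putative SH ORF:", putative_sh_nt, "nt,", codon_count(putative_sh_nt), "codons")
for name, length in ORF_LENGTHS_NT.items():
    print(name, length, "nt,", codon_count(length), "codons")
```

The 696 nt span at positions 4152–4847 corresponds to 232 codons, which matches the 232 aa figure given for the putative hydrophobic protein.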
3.2. Phylogenetic Analysis
The phylogenetic tree (Figure 2) built with the concatenated alignment of paramyxoviral core proteins (N-M-F-HN-L) distinctly groups each paramyxovirus species into defined genera and subfamilies, aligning fully with the current ICTV classification [6,7]. According to this tree, BSAPV-1 formed a highly supported (bootstrap = 100, SH-aLRT = 100) long-branched monophyletic clade with the only member of the genus Scoliodonvirus (subfamily Skoliovirinae), the Wenzhou Pacific spadenose shark paramyxovirus (species Scoliodonvirus scoliodontis).
According to a BLAST analysis, the closest relative of BSAPV-1 was Wenzhou Pacific spadenose shark paramyxovirus, with an identity of 30.1% at the level of the L protein, which is comparable to L protein identity values among representatives of distinct Paramyxoviridae subfamilies [7]. For the other three major proteins that were identified in both BSAPV-1 and its closest relative, the identity percentages were 20.1% for F, 19.2% for N, and 28.9% for HN, indicating substantial divergence.
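For readers unfamiliar with how such percentages arise, the Python sketch below computes pairwise identity from two already-aligned sequences. It is a simplified definition that ignores columns containing a gap in either sequence; BLAST reports identity over the length of its local alignment, so the values in the text would not necessarily be reproduced by this function. The toy sequences are hypothetical.

```python
def percent_identity(a, b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a, b):
        if x == gap or y == gap:
            continue  # gapped columns are excluded from the denominator
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy aligned pair (hypothetical): 4 matches over 6 ungapped columns.
print(f"{percent_identity('MKT-LLA', 'MRTALIA'):.1f}%")
```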
4. Discussion
The complete coding genome sequence of a novel paramyxovirus, which we named BSAPV-1, was discovered in trachea and colon swabs of a bearded seal from the northwest coast of Newfoundland, Canada. The sequenced genome is almost complete and has a length of 15,898 nt with a GC content of ~52%. The closest relative of BSAPV-1 is Wenzhou Pacific spadenose shark paramyxovirus, discovered in the Pacific spadenose shark (Scoliodon macrorhynchos) [26].
The BSAPV-1 genome organization resembles that of other members of this viral family and contains genes encoding five core proteins (N, M, F, HN, L) along with three putative proteins with no recognizable homologue in the NCBI GenBank database. The first one is located where the P/V or P/V/C gene (depending on the genus) is usually located and has a size of 1260 nt, which is similar to the sizes of P/V genes in other paramyxoviruses, ranging from 1300 to 1500 nt [6]. This indicates that this ORF could encode the P/V proteins, although this will have to be confirmed in future studies. The other two ORFs could correspond to the SH and tM proteins, as in other paramyxoviruses (e.g., members of Jeilongvirus). Protein structure predictions may be useful in clarifying the function of these predicted proteins.
An important element in determining the taxonomic affiliation of a virus to different genera of the Paramyxoviridae family is the identification of highly conserved motifs in intergenic regions [6]. For most paramyxoviruses belonging to the subfamilies Orthoparamyxovirinae and Metaparamyxovirinae, the repetition of the CTT motif in the regions between coding sequences is characteristic [6]. In turn, representatives of the Rubulavirinae and Avulavirinae subfamilies have intergenic regions of variable length with no traceable conserved repetitive motifs [6]. On the other hand, Wenzhou Pacific spadenose shark paramyxovirus has a highly conserved AAAAACTT motif in all its intergenic regions [26]. When we analyzed the intergenic regions of BSAPV-1, which vary significantly in length (87–201 nt), we found the conserved AAAACTTAAG motif, highlighting once again the somewhat close relationship between these two viruses.
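Scanning intergenic regions for a motif that is conserved but not invariant can be sketched in Python as follows. This is an illustrative approach rather than the motif detection performed in Geneious: it slides a window along the sequence and reports hits within a chosen number of mismatches. The function name and the toy region are hypothetical; the toy region is not BSAPV-1 sequence.

```python
def find_motif(seq, motif, max_mismatches=0):
    """Return (1-based start, matched window, mismatches) for each hit."""
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        mismatches = sum(1 for x, y in zip(window, motif) if x != y)
        if mismatches <= max_mismatches:
            hits.append((i + 1, window, mismatches))
    return hits


MOTIF = "AAAACTTAAG"  # motif reported in the text
# Hypothetical region with one exact copy and one copy carrying a single substitution.
toy_region = "GGCTAAAACTTAAGTCCGATAAAACTTGAGCC"

print(find_motif(toy_region, MOTIF))
print(find_motif(toy_region, MOTIF, max_mismatches=1))
```

Allowing one mismatch recovers the variant copy that an exact search misses, which is the situation described for the regions flanking the putative SH ORF.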
Within a family-wide phylogenetic analysis including all 153 classified Paramyxoviridae species, BSAPV-1 was located on a long branch separating it from the only member of the Skoliovirinae, its closest relative also in terms of sequence identity (30.1% within the L protein). While the current virus species demarcation threshold established by the ICTV is 85% pairwise identity between L proteins [6,7], there are no established demarcation criteria for genera and subfamilies. Considering the results of the phylogenetic analysis as well as the high divergence of BSAPV-1 from its closest relative, we concluded that BSAPV-1 is likely the first discovered member of a new paramyxoviral subfamily. In fact, similar distances can be observed between the L proteins of other paramyxoviruses from different subfamilies. For example, the identity between the L proteins of Wenzhou Pacific spadenose shark paramyxovirus (Skoliovirinae) and of its closest classified relative in the genus Parajeilongvirus (Orthoparamyxovirinae) is 32.2%, and the identity between the L proteins of the only classified member of the Glossavirinae (Wenling tonguesole paramyxovirus) and the only classified member of the Metaparamyxovirinae (Wenling triplecross lizardfish paramyxovirus) is 30.3% [7].
As the virus was detected in a tracheal/rectal swab of only one animal, and its closest relative was detected in the Pacific spadenose shark [26], we could not clearly establish the host of the detected virus and hypothesize that the virus originated from prey consumed by the seals, rather than being a seal virus. To test this hypothesis, future research should incorporate the screening of a larger number of seals and of prey species that are part of the seal diet, including benthic organisms. Since the virus was identified in bearded seals, whose range extends to Arctic and subarctic regions, future screening efforts should also focus on other coasts, such as the Canadian Archipelago, Alaska, Russia, Japan, Greenland, or the archipelago of Svalbard. Finally, this study highlights how metagenomic investigations are still crucial for expanding our understanding of virus diversity in understudied species.
References
- 1. Gryba R., Huntington H.P., Von Duyke A.L., Adams B., Frantz B., Gatten J., Harcharek Q., Olemaun H., Sarren R., Skin J. Indigenous knowledge of bearded seal (Erignathus barbatus), ringed seal (Pusa hispida), and spotted seal (Phoca largha) behaviour and habitat use near Utqiaġvik, Alaska, USA. Arct. Sci. 2021, 7, 832–858. doi:10.1139/as-2020-0052
- 2. Kingsley M.C.S., Stirling I. Haul-out behaviour of ringed and bearded seals in relation to defence against surface predators. Can. J. Zool. 1991, 69, 1857–1861. doi:10.1139/z91-257
- 3. Laidre K.L., Stirling I., Lowry L.F., Wiig Ø., Heide-Jørgensen M.P., Ferguson S.H. Quantifying the sensitivity of Arctic marine mammals to climate-induced habitat change. Ecol. Appl. 2008, 18, S97–S125. doi:10.1890/06-0546.1. PMID: 18494365
- 4. Womble J.N., Gende S.M. Post-breeding season migrations of a top predator, the harbour seal (Phoca vitulina richardii), from a marine protected area in Alaska. PLoS ONE 2013, 8, e55386. doi:10.1371/journal.pone.0055386. PMID: 23457468. PMCID: PMC3573017
- 5. Kovacs K.M. Erignathus barbatus. The IUCN Red List of Threatened Species; IUCN: Gland, Switzerland, 2025; e.T8010A279175908. doi:10.2305/IUCN.UK.2025-2.RLTS.T8010A279175908.en
- 6. Rima B., Balkema-Buschmann A., Dundon W.G., Duprex P., Easton A., Fouchier R., Kurath G., Lamb R., Lee B., Rota P., ICTV Report Consortium. ICTV virus taxonomy profile: Paramyxoviridae. J. Gen. Virol. 2019, 100, 1593–1594. doi:10.1099/jgv.0.001328. PMID: 31609197. PMCID: PMC7273325
- 7. Simmonds P., Adriaenssens E.M., Lefkowitz E.J., Oksanen H.M., Siddell S.G., Zerbini F.M., Alfenas-Zerbini P., Aylward F.O., Dempsey D.M., Dutilh B.E. Changes to virus taxonomy and the ICTV Statutes ratified by the International Committee on Taxonomy of Viruses. Arch. Virol. 2024, 169, 236. doi:10.1007/s00705-024-06143-y. PMID: 39488803. PMCID: PMC11532311
- 8. Bossart G.D., Duignan P.J. Emerging viruses in marine mammals. CAB Rev. 2018, 13, 1–17. doi:10.1079/PAVSNNR201813052
