Molecular Characterization of Complete Simian Foamy Virus Genomes from Three Colobine Monkeys Reveals Highly Divergent Evolutionary Trajectories and Identifies Transmission to Humans
Anupama Shankar, Haoqiang Zheng, David Cowan, Hongwei Jia, Gunars Osis, Alex Burgin, Mili Sheth, Nicole A. Hoff, Megan Halbrook, Anne W. Rimoin, Tony L. Goldberg, Colin A. Chapman, Nelson Ting, William M. Switzer

TL;DR
This study characterizes new simian foamy virus genomes from three monkey species and finds evidence of distinct viral evolution and human transmission.
Contribution
The study reveals divergent evolutionary paths of SFVs in colobine monkeys and identifies new human infections.
Findings
New SFV genomes from Trachypithecus francoisi, Pygathrix nemaeus, and Colobus guereza were characterized.
A Δtas mutation in SFVpne may promote viral latency.
Four new human infections with Cgu-derived SFV were identified in the Democratic Republic of Congo.
Abstract
Simian foamy viruses (SFVs) are ancient retroviruses that co-evolve with nonhuman primates (NHPs), although genomic data from Asian and African monkeys are limited. We report the characterization of three new SFV colobine genomes from two Asian species (Trachypithecus francoisi (Tfr) and Pygathrix nemaeus (Pne)) and one African monkey (Colobus guereza, Cgu), obtained via metagenomics analysis of peripheral blood leukocyte tissue culture isolates. Genomic analyses found conserved structural, enzymatic, and auxiliary genes flanked by long terminal repeats, with all major transcriptional and structural motifs highly preserved. An in-frame Δtas mutation in tissue culture and ex vivo specimens was identified in the SFVpne genome, which may promote viral latency. Phylogenetic analyses revealed that these colobine SFVs have distinct evolutionary trajectories without clustering together,…
Funding
- Canada Research Chairs Program (Colin Chapman)
- National Institutes of Health (NIH)
Taxonomy
Topics: HIV Research and Treatment · Chromosomal and Genetic Variations · Polyomavirus and related diseases
1. Introduction
Simian foamy viruses (SFVs) are complex retroviruses in the family Retroviridae, subfamily Spumaretrovirinae, and genus Simiispumavirus, infecting a wide range of nonhuman primates (NHPs) across Asia, Africa, and Latin America [1,2,3,4,5,6,7,8,9,10,11,12,13,14,15]. SFVs exhibit genetic stability with very low evolutionary rates, having co-evolved with their NHP hosts for millions of years, resulting in species-specific lineages. SFVs establish lifelong persistent infections, often without causing disease, likely because of adaptation during their co-evolutionary history [14]. SFV infection is characterized by persistent seropositivity and molecular detection in peripheral blood lymphocytes (PBLs) and other body sites, including the oral cavity, where viral replication occurs [14]. Transmission of SFV typically occurs through saliva during aggressive encounters, such as bites and scratches, which occur during primate social interactions [14].
Although no human-specific SFV has been identified, zoonotic transmission frequently occurs through contact with NHPs during hunting for bushmeat, biomedical research, or in zoological collections [16,17,18]. In such cases, phylogenetic analyses have exploited the co-evolutionary history of SFVs to identify the NHP source of infection [17,18,19]. Human SFV infections have originated from a range of Old-World NHPs, including chimpanzees, gorillas, baboons, mandrills, African green monkeys, and various macaque species [13,17,18,19,20,21,22,23]. Like the simian hosts, SFV infections in humans appear nonpathogenic, although this conclusion is biased by studying healthy populations with limited clinical follow-up of SFV-infected individuals [24,25]. For example, a recent cross-sectional case–control study of asymptomatic male hunters from Cameroon infected with gorilla SFV reported anemia and hematological abnormalities, though the clinical significance remains unclear [25]. As with other retroviruses, such as simian and human immunodeficiency viruses (SIV and HIV) and T-cell lymphotropic viruses, disease may take decades to develop [26,27,28,29]. Hence, SFV-associated diseases may be rare, cryptic, or restricted to infections with specific viral variants [18].
Research on SFV evolutionary relationships in Old-World monkeys has primarily focused on African species, with limited representation from Asia, mainly involving Macaca species due to their prevalence in zoo collections, research studies, and interactions with humans near temples [4,30,31,32,33,34]. The Colobinae subfamily, known as colobines or leaf-eating monkeys, is a diverse group of Old-World primates found across equatorial Africa (Colobini tribe) and Asia (Presbytini tribe), having diverged from the Cercopithecinae approximately 13.8 million years ago (MYA) [35,36]. Although the taxonomy of specific Colobinae clades remains debated, this subfamily includes the African genera Colobus (black and white colobus), Procolobus (olive colobus), Piliocolobus (red colobus), the Asian genera Presbytis (surilis), Trachypithecus (lutung, langur, or leaf monkey), Semnopithecus (gray langur), Rhinopithecus (snub-nosed monkey), Pygathrix (douc langur), Nasalis (proboscis monkey), and Simias (pig-tailed langur). Asia has 57 colobine species compared to 23 in Africa [35].
Previously, we demonstrated that SFVs from captive Colobus guereza and wild Piliocolobus rufomitratus tephrosceles from Uganda and P. badius badius from Côte d'Ivoire clustered phylogenetically within the Cercopithecinae as sister taxa to the Macaca SFV by using short integrase (IN) sequences obtained by generic PCR-amplification of PBL DNA [3]. While these results indicate genetic relatedness of the African Colobinae SFV, the co-evolutionary hypothesis suggests these colobus SFVs should cluster outside the SFVs of the Cercopithecinae. In contrast, only one SFV pol sequence has been reported from the Asian colobine Trachypithecus francoisi (Francois’ langur), which formed a distinct and highly divergent phylogenetic lineage between the ape and Cercopithecinae SFVs, consistent with the co-evolutionary hypothesis for the Colobinae hosts and SFVs [37]. The T. francoisi SFV sequence was obtained with generic PCR amplification from tissue culture cells, as all SFV western blot (WB)-positive Francois’ langurs and Pygathrix nemaeus (red-shanked douc langur) PBL DNA samples in that study were PCR-negative, further highlighting the high divergence of Asian colobine SFVs [37].
In the Democratic Republic of Congo (DRC), we demonstrated human infection with a novel colobus SFV from C. angolensis (Angolan colobus) in two of sixteen WB-positive women [13]. Among eleven WB-positive persons in the study with available PBL DNA, ten reported NHP exposure, including eight with contact with C. angolensis, but were negative for SFV using generic PCR assays [13]. Similarly, in Asia, individuals are exposed to NHPs, including langurs and macaques, through activities related to deforestation, agricultural expansion, hunting, and when sharing urban settings like parks, religious sites, animal markets, and zoos [4,38,39]. Two studies reported SFV infections from Macaca species in four persons in Asian countries who lived or worked near NHPs [38,39]. In one of these studies, five persons were WB-positive but PCR-negative using a generic SFV pol assay [39]. These negative PCR results in both Africa and Asia likely reflect viral loads below the detection threshold of the molecular assays or infections with divergent SFVs not readily identified with the currently used PCR methods. While PCR negativity alone does not demonstrate viral divergence, this pattern is consistent with substantial primer–template mismatch caused by sequence divergence, particularly given the successful amplification of SFV sequences from tissue culture isolates derived from the same animals. The generic pol PCR assays used in these studies were designed based on complete SFV genomes available at the time, including chimpanzee, gorilla, macaque, and African green monkey. While effective for detecting various SFVs from multiple species, these assays may not identify highly divergent SFVs, as demonstrated with gibbon SFV [40].
A database containing sequences from divergent SFV lineages is crucial for the molecular surveillance of SFVs in NHPs and zoonotically infected humans. Currently, the only complete SFV genomes from Asian NHPs are from an orangutan (SFVppy_bella, Pongo pygmaeus pygmaeus, GenBank # AJ544579), a pileated gibbon (SFVhpi_SAM106, Hylobates pileatus, GenBank # MF621235), and five macaques (SFVmcy_FV21, M. cyclopis, GenBank # NC_010819; SFVmcy_FV34[RF], M. cyclopis, GenBank # KF026286; SFVmfa_Cy5061, M. fascicularis, GenBank # KF026286; SFVmfu_WK1.pJM356, M. fuscata, GenBank # AB92351; and SFVmmu_K3T, M. mulatta, GenBank # MF280817). No complete SFV genomes from Asian or African Colobinae exist, despite their significant taxonomic diversity and wide geographic distribution. To address this gap and improve diagnostic assays for detecting Colobinae SFV in humans, we obtained and characterized complete SFV genomes from one African (Colobus guereza, Cgu) and two Asian species (Trachypithecus francoisi (Tfr) and Pygathrix nemaeus (Pne)). We used these new SFV genomes to investigate their evolutionary histories through detailed phylogenetic analyses, to design generic PCR assays for their detection, and to apply these new assays to PBL DNA samples from SFV-seropositive persons in the DRC with NHP exposure.
2. Methods
Blood Sample Processing, Serology, Co-Culture, and PCR Detection. EDTA-treated whole blood specimens were collected from captive colobine monkeys at North American zoological gardens during their annual examinations, following the guidelines of animal care and use committees at each institution. PBLs and plasma were obtained by Ficoll-Hypaque centrifugation of whole blood, and DNA lysates were prepared as described previously [37]. Human DNA specimens from persons in the DRC were extracted from buffy coats using the Flexigene DNA extraction kit (Qiagen) and quantified with a Nanodrop instrument [11]. The UCLA Institutional Review Board approved the collection, storage, and future testing of blood samples collected in 2007 from all consenting study participants as previously described [11]. This study (Protocol #07041) was reviewed by the Centers for Disease Control and Prevention (CDC), deemed research not involving human subjects, and conducted in accordance with applicable federal law and CDC policy using anonymized participant specimens and information. Demographic and animal contact information for DRC participants was collected with study questionnaires.
SFV serology was performed on plasma using a combination of validated EIA and western blot (WB) assays that broadly detect SFV in Old-World monkeys and apes [30,37]. SFV was isolated from three seropositive colobines (Cgu_910916, Pne_500057, and Tfr_083616) by co-culturing peripheral blood lymphocytes (PBLs) with canine thymocyte (Cf2Th) cells as previously described [37]. In brief, cryopreserved PBLs were rapidly thawed and maintained in interleukin-2-supplemented medium at 37 °C for 72 h. Following stimulation, cells were washed and combined with Cf2Th cells at a 1:1 ratio. The co-cultures were examined at 3–4 day intervals for the development of syncytial cytopathic effects (CPE) characteristic of SFV infection. Upon confirmation of CPE, Cf2Th cells and corresponding culture supernatants were harvested and preserved in liquid nitrogen.
The integrity of PBL and buffy coat DNA was confirmed by β-actin PCR [11]. Initial SFV PCR testing was conducted using 1.0 µg of PBL or 100 ng of tissue culture DNA in a generic nested pol PCR assay designed to amplify a 465 bp IN region that has been used successfully to detect diverse SFVs [12,37]. This assay amplified integrase (IN) sequences from Cgu_910916 PBL and Tfr_083616 tissue culture DNA, but not from Pne_500057 [37,41]. We then developed a new nested PCR assay in IN based on a multiple sequence alignment, which included newly acquired complete genomes from the three colobine monkeys and those from 28 other Old-World monkeys and apes. The IN region was selected because it allows broad amplification of diverse SFVs while providing sufficient phylogenetic resolution to distinguish host-associated lineages and is widely used for comparative evolutionary analyses of foamy viruses. The first round PCR used 200 ng DNA with primers SFV1M 5′-GAY AAR CTT GCC ACC CAA GG-3′ and FVPGR1M 5′-CCT GYA RAA GAG ANA RYT CYT CTT CTC-3′ with 3.4 units Expand High Fidelity Taq polymerase (Roche, Indianapolis, IN) per reaction under the conditions of 95 °C for 30 s, 53 °C for 30 s, and 72 °C for 2 min for 40 cycles. The 2nd round PCR used primers SIF3M 5′-CCA ARC CTG GAT GCA GAG YTG GAT C-3′ and FVPGR2M 5′-TCT TCT CKN GWY AAR TCA AGT GT-3′ with 2.5 units AmpliTaq (Thermo Fisher Scientific, Waltham, MA, USA) per reaction under the conditions of 95 °C for 30 s, 50 °C for 30 s, and 72 °C for 2 min for 40 cycles. The primary and nested PCR product sizes were 950 bp and 854 bp, respectively. The sensitivity of the new PCR assay was estimated using 10-fold titrations of nucleic acids from PBL tissue culture supernatants of SFVcgu_910916, SFVpne_500057, and SFVtfr_083616.
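The new IN primers contain IUPAC ambiguity codes (Y, R, K, W, N), so each one is a pool of related oligonucleotides. The sketch below is illustrative only and is not part of the published protocol. It computes the length and degeneracy of the four primers as printed above. The helper names (`clean`, `degeneracy`, `expand`) are hypothetical.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}

# Primer sequences copied from the text; spaces only separate triplets.
PRIMERS = {
    "SFV1M": "GAY AAR CTT GCC ACC CAA GG",
    "FVPGR1M": "CCT GYA RAA GAG ANA RYT CYT CTT CTC",
    "SIF3M": "CCA ARC CTG GAT GCA GAG YTG GAT C",
    "FVPGR2M": "TCT TCT CKN GWY AAR TCA AGT GT",
}


def clean(primer: str) -> str:
    """Remove triplet spacing and normalise to upper case."""
    return primer.replace(" ", "").upper()


def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides encoded by a degenerate primer."""
    count = 1
    for base in clean(primer):
        count *= len(IUPAC[base])
    return count


def expand(primer: str) -> list[str]:
    """Enumerate every unambiguous sequence encoded by a degenerate primer."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in clean(primer)))]


for name, seq in PRIMERS.items():
    print(f"{name}: length {len(clean(seq))} nt, degeneracy {degeneracy(seq)}")
```

Higher degeneracy broadens the range of templates a primer can match but lowers the effective concentration of each individual oligonucleotide, which is one reason generic assays trade breadth against sensitivity.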
Previous studies indicated that SFV adaptation to tissue culture can cause deletions in the transcriptional transactivator (tas) gene, eliminating Tas production and inducing transcription of the Bet (between envelope (env) and tas) protein [42,43,44,45]. These tas deletion mutants are referred to as Δtas. Overproduction of Bet in the absence of Tas downregulates viral replication, favoring viral persistence and latency [46,47]. We assessed intact and Δtas variants using a new nested PCR assay using primers designed from the SFVpne_500057 genome, employing standard PCR conditions except for a 50 °C annealing temperature and 40 amplification cycles. The outer and inner PCR primers are 10100F 5′-CCA TCA ACA GTA ACC TGG CAC-3′ and 10590R 5′-TCT TGG TAG CGC CGC TTC CTA-3′, and 10165F 5′-GAG AGA TTG GGT ACC TGA TCC-3′ and 10590R 5′-GAG CGA CGT TTT GGG AGT CGA G-3′, respectively.
Next Generation Sequencing and SFV Genome Assembly. To obtain complete genomes from viral isolates, we used a metagenomics approach described in detail elsewhere [40]. Briefly, we centrifuged 0.5–1.0 mL of tissue culture supernatant at 43,000 rpm at 4 °C for 30 min and resuspended the viral pellet in 165 μL of supernatant. The sample was treated with a cocktail of DNase enzymes (Turbo DNase, Ambion, Austin, TX, USA; Baseline-ZERO DNase, Biosearch Technologies, Middlesex, UK). Viral nucleic acids were extracted with the QIAamp MinElute Virus Spin Kit (Qiagen, Hilden, Germany) following the manufacturer’s protocol, without the addition of carrier RNA. Purified RNA was reverse-transcribed using random hexamer primers to generate double-stranded cDNA with the SuperScript Double-Stranded cDNA Synthesis Kit (Invitrogen, Carlsbad, CA, USA). The resulting cDNA was cleaned using Agencourt AMPure XP beads (Beckman Coulter, Brea, CA, USA). Approximately 1 ng of purified cDNA was then subjected to concurrent fragmentation and adapter tagging using the Illumina DNA Prep Kit (Illumina, San Diego, CA, USA). Libraries were sequenced on an Illumina MiSeq platform with the MiSeq Reagent Kit v2 (500 cycles), generating paired-end reads in a 2 × 150 bp format.
Illumina’s native demultiplexing software (bcl2fastq v2.20.0.422) was used for initial read processing, giving an average read Q score of 33.8. Because the isolates were grown in Cf2Th cells, the data were then de-hosted by excluding reads that aligned to the dog reference genome (Illumina iGenomes, Canis familiaris NCBI build 3.1) using Bowtie2 v2.5.1 [48]. We processed the remaining reads with the nf-core/viralrecon pipeline (v2.6.0, Rhodium Raccoon) in de novo assembly mode, which performed quality control (FastQC), genome assembly using SPAdes [49], Unicycler [50], and Minia [51], and assembly reporting with QUAST [52]. We analyzed the resulting assemblies with BLASTN to evaluate the content, coverage, and length [53]. Contigs of the appropriate size representing SFV genomes were selected for further analysis, yielding an average mean mapping quality of 38.2 and a mean depth of 24,052 for all contigs.
Sequence Analysis of Complete SFV Genomes. We used the DNA-to-protein translation website (http://insilico.ehu.es/translate/; accessed on 31 March 2024 and 31 May 2024) to identify protein-coding reading frames in the sense direction of the SFVcgu_910916, SFVpne_500057, and SFVtfr_083616 genomes.
The boundaries of the complete 5′ and 3′ long terminal repeats (LTRs) were defined by manual inspection, guided by alignment with previously published SFV reference genomes. Putative splice donor and acceptor sites were predicted using the neural network–based NetGene2 platform (http://www.cbs.dtu.dk/services/NetGene2/; accessed on 31 March 2024, 31 May 2024). Candidate nuclear localization signals (NLS) within the Tas protein were assessed using both NucPred (https://nucpred.bioinfo.se/cgi-bin/single.cgi; accessed on 31 March 2024, 31 May 2024) and PSORTII (https://psort.hgc.jp/form2.html; accessed on 31 March 2024, 31 May 2024).
The five principal open reading frames—gag, pol, env, tas, and bet—were identified and extracted in Geneious v2025.0.3 and aligned against representative monkey SFV genomes with complete sequences. For phylogenetic analyses, a concatenated sequence comprising the three major structural and enzymatic genes (gag, pol, and env) was constructed to enhance analytical robustness.
Codon-aware nucleotide alignments of both the concatenated dataset and individual pol sequences were generated with MAFFT v7.0.26 [54], followed by manual curation and removal of gap-containing regions. The optimal nucleotide substitution model was selected using the model-testing function in MEGA v7.0.26, which identified the general time reversible model with gamma-distributed rate variation and a proportion of invariable sites (GTR + G + I) as the best fit. Phylogenetic signal was evaluated through likelihood mapping of quartet topologies in IQ-TREE v1.6.12 [55]. In addition, overall phylogenetic signal and substitution saturation in the alignments were examined using DAMBE v7.0.35 (http://dambe.bio.uottawa.ca/DAMBE/dambe.aspx; accessed on 26 February 2026), which identified saturation at the 3rd codon position in the alignment. Consequently, we used the 1st and 2nd codon positions of the alignment for the concatemer phylogenies as previously described [40].
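The alignment preparation described above (dropping gap-containing regions, then keeping only the 1st and 2nd codon positions) can be sketched as follows. This is a minimal illustration that assumes an in-frame, codon-aligned input. The function names and toy sequences are hypothetical and are not taken from the study data or from the tools the authors used.

```python
def drop_gapped_codons(alignment: dict[str, str]) -> dict[str, str]:
    """Remove every codon column in which at least one sequence has a gap."""
    length = len(next(iter(alignment.values())))
    if length % 3 != 0 or any(len(s) != length for s in alignment.values()):
        raise ValueError("expected equal-length, in-frame codon alignment")
    keep = [
        i for i in range(0, length, 3)
        if all("-" not in seq[i:i + 3] for seq in alignment.values())
    ]
    return {name: "".join(seq[i:i + 3] for i in keep)
            for name, seq in alignment.items()}


def first_second_positions(aligned_cds: str) -> str:
    """Keep codon positions 1 and 2, discarding the 3rd position."""
    if len(aligned_cds) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return "".join(aligned_cds[i:i + 2] for i in range(0, len(aligned_cds), 3))


# Toy codon alignment (hypothetical sequences, for illustration only).
toy = {"seqA": "ATGGCCTTA", "seqB": "ATG---TTG"}
ungapped = drop_gapped_codons(toy)
reduced = {name: first_second_positions(seq) for name, seq in ungapped.items()}
print(ungapped)
print(reduced)
```

In the toy data the two sequences differ only at a 3rd codon position, so they become identical after reduction. This shows how discarding saturated 3rd positions trades some signal for less noise.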
We inferred the colobine concatemer sequence phylogeny using Bayesian inference with BEAST v1.8.4, using a birth-death speciation tree prior and the 1st and 2nd codon positions of the concatemer alignment, as we have shown that the 3rd codon position is saturated for phylogenetic analysis [40]. We included SFVs from 19 other simians and three non-simians (feline, bovine, and equine) to explore the full FV phylogeny using 400 million Markov Chain Monte Carlo (MCMC) iterations with a 10% burn-in. We used primate and non-primate fossil and genomic divergence dates to calibrate the relaxed molecular clock as normal or exponential tree priors (Table S1) [36,56,57]. Trees were logged every 40,000 generations, and two independent BEAST runs were performed to verify convergence and reliability of the results. We used Tracer v1.7.2 to confirm convergence, ensuring effective sample size (ESS) values >250. TreeAnnotator v1.8.4 was applied to select the maximum clade credibility tree from the posterior distribution of 10,001 sampled trees, with a burn-in value of 1000 trees. The pol phylogeny was constructed using the approximate ML method with FastTree v2.2.0 (https://morgannprice.github.io/fasttree/; accessed on 26 February 2026) and the GTR nucleotide substitution model with clades defined by Shimodaira–Hasegawa (SH) support values ≥0.7. Representative SFV pol sequences across primate taxonomy available at GenBank were included for the pol phylogenetic analysis. The inferred concatemer and pol trees were visualized using FigTree v1.4.4 (http://tree.bio.ed.ac.uk/software/figtree/; accessed on 26 February 2026).
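The sampling figures quoted above are consistent with one another, as the short arithmetic check below shows. It assumes the initial state (generation 0) is logged in addition to every 40,000th generation, which would account for the extra tree in the 10,001 total. The variable names are illustrative.

```python
# Chain settings as reported in the Methods.
chain_length = 400_000_000   # MCMC iterations
log_every = 40_000           # generations between logged trees
burn_in_trees = 1_000        # trees discarded in TreeAnnotator

# Assumption: the state at generation 0 is logged too, hence the "+ 1".
sampled_trees = chain_length // log_every + 1
retained_trees = sampled_trees - burn_in_trees
burn_in_fraction = burn_in_trees / sampled_trees

print(f"sampled trees:   {sampled_trees}")
print(f"retained trees:  {retained_trees}")
print(f"burn-in fraction: {burn_in_fraction:.3f}")
```

A burn-in of 1000 of the 10,001 trees corresponds to the stated 10% burn-in.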
Genetic recombination in the concatemer alignment was evaluated using the Recombination Detection Program (RDP) v4.101 using the RDP, GENECONV, MaxChi, Chimaera, Bootscan, 3Seq, and SiScan algorithms [58].
GenBank Accession Numbers. The complete genomes of SFVcgu_910916, SFVpne_500057, and SFVtfr_083616 were assigned accession numbers PP966955, PP966956, and PP966957, respectively. The human and monkey SFV pol sequences have accession numbers PV939864-PV939903.
Statistical Analysis. We assessed the statistical significance of the proportion of SFV infections detected by the original and new pol PCR assays using a two-tailed Z score test at a significance level of 0.01 at https://www.socscistatistics.com/tests/ztest/default2.aspx; accessed on 31 November 2025.
3. Results
SFV genome assembly. We obtained the complete SFV genomes of SFVcgu_910916, SFVpne_500057, and SFVtfr_083616 using a metagenomics approach with tissue isolate supernatants when ≥50% CPE was observed starting on day 18 of culturing. The three novel SFV genomes were composed of 150,000 to 4.4 million reads, resulting in a mean depth coverage ranging from 1783 to 60,529. The sequences of the complete genomes were determined by manual alignment of overlapping 5′ and 3′ LTR regions to give final lengths of 12,645 bp, 12,649 bp, and 12,644 bp for SFVcgu_910916, SFVpne_500057, and SFVtfr_083616, respectively (Table 1).
Genome comparisons of the new SFVs with other related Asian and African monkey SFVs identify distinct Colobinae SFV evolutionary histories. Gene length comparisons of the new SFVs with other monkey SFVs are provided in Table 1. The new SFV genomes were closer in length to that of an Asian Macaca cyclopis SFV but shorter than those from African monkeys. The lengths of the LTRs and the five coding regions for the three new SFVs were comparable to those from African and Asian monkey SFVs. Notably, the SFVpne_500057 LTR and five coding region lengths were closer (LTR and gag) or identical (pol, env, tas, bet) to those from the Macaca cyclopis SFV.
In Table 2, we provide the nucleotide and amino acid identity comparisons of the major genes and proteins, respectively, of the three new SFVs with those from related Asian and African monkey SFVs, and for the LTRs and complete genome nucleotide sequences. Sequence analysis showed that SFVtfr_083616 was nearly equidistant from other monkey SFVs sharing approximately 60% nucleotide identity across the genome. In contrast, SFVcgu_910916 and SFVpne_500057 were closer genetically to SFVcae_LK3 and SFVpan_V909 from African Chlorocebus and baboon monkeys, respectively. As expected, the highest identities were seen in the pol and env genes, and the lowest were seen in the LTR, tas, and bet regions.
Bayesian phylogenetic analysis of gag-pol-env concatemers inferred divergent evolutionary trajectories for SFVcgu_910916, SFVpne_500057, and SFVtfr_083616, which were polyphyletic instead of forming a Colobinae clade as would be expected under a strict host co-evolution scenario (Figure 1). Only SFVtfr_083616 mirrored the host phylogeny, appearing basal to the African and Asian (Cercopithecinae) monkey SFVs. In contrast, SFVcgu_910916 was basal to the African monkey (Cercopithecini) SFV, while SFVpne_500057 clustered within the macaque SFV clade, likely due to ancestral host switching. Of the remaining FV genomes, only the New-World monkey SFVssc (Saimiri sciureus) did not conform to the host co-evolution model, confirming results from previous studies [16,59]. Divergence dating showed that the time to most recent common ancestor (TMRCA) for the Asian SFVtfr_083616 genome was the oldest among Colobinae SFVs (19.12 million years ago (mya)), followed by the African SFVcgu_910916 genome (10.78 mya), with the Asian SFVpne_500057 genome TMRCA evolving more recently within the radiation of the Macaca (2.07 mya) (Table 3). Only the SFVcgu_910916 TMRCA aligned with the estimated ancestral colobine divergence of 7.44–16.68 mya; TMRCAs for SFVpne_500057 and SFVtfr_083616 were younger and older, respectively (Table 3). We did not find evidence of recombination in the colobine gag-pol-env concatemers using RDP, reducing the likelihood that recombination accounts for the observed gene-tree and TMRCA discordances. Combined, our results suggest that Colobinae SFV evolution has not strictly mirrored host phylogeny and that host switching appears to have influenced SFV phylogeny.
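The comparison of each TMRCA with the estimated colobine host divergence interval can be written as a small check. The numbers are those reported in the paragraph above. The `classify` helper is a hypothetical name used only for illustration.

```python
# Estimated ancestral colobine divergence interval (mya), from the text.
COLOBINE_DIVERGENCE_MYA = (7.44, 16.68)

# TMRCA point estimates (mya) for the three new genomes, from the text.
TMRCA_MYA = {
    "SFVtfr_083616": 19.12,
    "SFVcgu_910916": 10.78,
    "SFVpne_500057": 2.07,
}


def classify(tmrca: float, interval: tuple[float, float]) -> str:
    """Place a TMRCA relative to a host divergence interval."""
    low, high = interval
    if tmrca < low:
        return "younger"
    if tmrca > high:
        return "older"
    return "within"


for virus, tmrca in TMRCA_MYA.items():
    print(f"{virus}: {tmrca} mya is {classify(tmrca, COLOBINE_DIVERGENCE_MYA)} "
          f"relative to the host interval")
```

This compares point estimates only. A fuller comparison would use the credible intervals reported in Table 3, which are not reproduced here.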
Conservation of important functional domains in the major coding regions of the new SFVs when compared with related Asian and African monkey SFVs. An alignment of the complete genomes of SFVcgu_910916, SFVpne_500057, and SFVtfr_083616, provided in Figure S1, shows the location and conservation of the polyadenylation (poly A) signal and TATA boxes in the LTRs. The primer binding site and dimerization signals in the pre-gag region are highly conserved, as are the 3′ polypurine tract (PPT) in pol and the central PPT preceding the 3′ LTR. The internal promoter in env is somewhat conserved. The splice donor (SD) signal in the tas/bet coding region is highly conserved, but the inferred splice acceptor (SA) signal for SFVtfr_500057 is distinct and shifted upstream in the alignment by five nucleotides.
Figure S2 shows an alignment of the SFVcgu_910916, SFVpne_500057, and SFVtfr_083616 Gag residues with those from five macaques and five Cercopithecinae SFVs. The PSAP motif important for viral budding, the arginine (R) residue within the N-terminal cytoplasmic targeting and retention signal (CTRS), and the YXXL motif required for particle assembly were completely conserved in all 13 SFVs. We also identified the three glycine-rich (GR) boxes in the C terminus of the Gag protein that are important for nucleic acid binding and packaging, reverse transcription, and capsid assembly. Within GR2 is the relatively conserved chromatin-binding sequence (CBS) used by the SFV Gag protein to bind to chromosomes for nuclear accumulation in newly infected cells, contributing to integration site preference [61].
Within Pol, we observed that the reverse transcriptase (RT) active center (YVDD), the RNase H residues (DSF), and the integrase (IN) zinc binding motif (BM) with residues HHCC were completely conserved (Figure S3). The IN active center (AC) was mostly conserved with residues DDE, except for SFVpan_V909-03F_MK241969, which has residues NDE. The RT-IN cleavage site (CS) YVVN/XNXX was partially conserved, with SFVpne_500057 and macaque SFV having YVVH/XNXX and SFVtfr_083616 having YVMN/XVXX.
For Env, the WXXW motif in the N-terminal cytoplasmic domain of the leader peptide subunit, important for capsid interaction, is evolutionarily conserved (Figure S4) [62]. The fusion peptide, membrane-spanning domain (MSD), and endoplasmic reticulum retrieval signal are relatively conserved. The optimal furin cleavage sites (RX[K/R]R) at the surface protein and transmembrane junction are conserved, but at the leader peptide and surface protein location, the residues are RXXR.
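Short motifs such as those named above (PSAP, YXXL, YVDD, WXXW, RX[K/R]R, RXXR) can be located in a protein sequence with regular expressions. The sketch below is illustrative and does not reproduce the authors' alignment-based inspection. The pattern dictionary, the `find_motif` helper, and the toy peptide are hypothetical. The peptide is invented for demonstration and is not an SFV sequence.

```python
import re

# Motifs named in the text, written as regular expressions ("." = any residue).
MOTIFS = {
    "PSAP": r"PSAP",
    "YXXL": r"Y..L",
    "YVDD": r"YVDD",
    "WXXW": r"W..W",
    "RX[K/R]R": r"R.[KR]R",   # optimal furin cleavage site
    "RXXR": r"R..R",          # minimal furin cleavage site
}


def find_motif(pattern: str, protein: str) -> list[int]:
    """Return 0-based start positions of all (possibly overlapping) matches."""
    # A zero-width lookahead lets overlapping occurrences be reported.
    return [m.start() for m in re.finditer(f"(?={pattern})", protein)]


# Invented toy peptide for demonstration only (not a real SFV protein).
toy_peptide = "MWAAWGRSKRLPSAPYVDDYAALRAARQ"

for name, pattern in MOTIFS.items():
    print(f"{name}: {find_motif(pattern, toy_peptide)}")
```

Every RX[K/R]R site also matches the looser RXXR pattern, so the toy peptide reports its optimal site under both names, while the RAAR site is reported under RXXR only.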
The transcriptional transactivator (Tas) protein of SFVs contains a bipartite nuclear localization signal (NLS) that directs Tas into the nucleus of cells, where it acts upon promoter sequences in the U3 region of the LTR to initiate transcription of the gag, pol, and env genes and in the internal promoter in env to facilitate transcription of the regulatory genes tas and bet [63]. Bioinformatic analysis identified the potential NLS for each SFV in the Tas alignment (Figure S5) and confirmed the high genetic heterogeneity in the NLS in Tas. Sequence analysis showed that the NLS of SFVpne_500057 shared more identity with those from Macaca SFV (64.7–76.5%) than those from African monkey SFV, including SFVcgu_910916 (35.3–41.2%). Deletions in tas have been reported for SFV tissue culture isolates [42,43,45,46]. Of the three new colobine SFV genomes, only SFVpne_500057 contained an in-frame Δtas variant of 294 bp in length in the assembled NGS genome (Figure S5). The position of Δtas occurs at the exact locations of the splice donor and acceptor junctions for generation of the bet coding region, such that Tas transcription is eliminated and only the Bet reading frame is preserved. We confirmed that the Δtas and intact tas genes were present in both the PBL tissue culture supernatants and PBLs from Pne_500057 with nested PCR from two collection dates (23 May 2000 and 7 November 2000). These results suggest the Δtas variant was likely selected in vitro and represented the majority variant in tissue culture.
While Bet has been shown to downregulate viral transcription, little is known about the regulatory motifs within Bet that are responsible for this functionality (Figure S6) [43,64]. One study has shown that the K/RGD motif in the C-terminus of Bet may be involved in binding integrins, which are known for their role in virus entry and infection [65]. Inspection of the Tas alignment identified a conserved XGD motif in SFVpne_500057, the macaque SFVs, and one baboon SFV (SFVpan_V909-03F), but only the D residue was present in the remaining monkey SFV Tas proteins (Figure S5). While one study reported the presence of an NLS in the C-terminus of a Tas protein from a human infected with a chimpanzee SFV, we did not find an NLS in any of the Asian or African monkey SFVs using the tool PSORTII [64].
Distribution and phylogenetic characterization of novel colobine SFVs in wild and captive NHPs. We developed and applied a new generic PCR assay for the detection of IN sequences by using the new SFV genomes obtained in our study. By using 10-fold titrations of nucleic acids recovered from PBL tissue culture supernatants of Cgu_910916, Pne_500057, and Tfr_083616, we found the new PCR assay was very sensitive and could detect 0.4 copies/reaction, 1.4 copies/reaction, and 0.6 copies/reaction, respectively. Upon testing of PBL DNA from 75 animals across seven Colobinae genera, of which 44% (33/75) were seropositive, we detected SFV sequences in 90.9% (30/33) of the seropositive animals (Table 4). We detected SFV sequences for the first time in two additional Trachypithecus species (T. cristatus cristatus and T. obscurus) by using the new PCR test. In contrast, only 27.3% (9/33) of the seropositive animals were confirmed with SFV infection by using the original PCR assay. The difference in proportions of SFV PCR-positive animals for each test was significant (p < 0.00001) using a two-tailed Z score test. None of the seronegative monkeys tested positive with either the new or the original SFV assay, demonstrating the high specificity of both PCR tests.
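The comparison of detection rates between the two assays can be reproduced from the reported counts (30/33 versus 9/33). The sketch below assumes a pooled-variance two-proportion Z test that treats the two sets of results as independent samples. The online calculator's exact formula is not stated in the text, so this is an approximation of the reported analysis rather than a replication. The function name is hypothetical.

```python
import math


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> tuple[float, float]:
    """Pooled two-proportion Z test; returns (z, two-tailed p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-tailed p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Counts from the text: PCR-positive animals among 33 seropositive animals.
z, p = two_proportion_z_test(30, 33, 9, 33)
print(f"new assay: {30 / 33:.1%}, original assay: {9 / 33:.1%}")
print(f"z = {z:.2f}, two-tailed p = {p:.1e}")
print("significant at 0.01:", p < 0.01)
```

Under these assumptions the Z statistic is a little above 5, giving a two-tailed p-value well below the 0.01 threshold and in line with the reported p < 0.00001.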
Phylogenetic analysis revealed that, like the gag-pol-env concatemers from the three new SFV genomes, all Trachypithecus IN sequences clustered together basal to Cercopithecinae SFVs (Figure 2). Within the Trachypithecus clade, the SFV IN sequences from T. cristatus cristatus (Tcr) and T. obscurus (Tob) formed a sister clade to T. francoisi (Tfr) with high support (SH = 0.97), indicating cospeciation. The African colobine SFV (Cgu, Can, Cas) formed a single clade basal to the African Cercopithecinae SFVs, consistent with the gag-pol-env concatemer analysis, and included species-specific subclades for colobus and procolobus (Pba, Pte) SFVs. All P. nemaeus SFVs formed a clade basal to Macaca SFVs with strong support (SH = 1), rather than clustering within Macaca SFVs as in the gag-pol-env analysis. Notably, one C. angolensis (Can_593363) IN sequence clustered with captive Allenopithecus SFVs and an IN sequence from a wild-born Cercopithecus mona monkey from DRC. Can_593363 was housed with Allenopithecus monkeys in a zoological garden, and in the wild, the distributions of C. mona and Allenopithecus monkeys overlap. Both scenarios likely explain the observed cross-species SFV infections and phylogenetic clustering of their SFV sequences.
Detection of Colobinae SFV in persons exposed to NHPs in DRC. Next, we used the new SFV PCR assay to re-test buffy coat DNA from participants in DRC who were SFV WB seropositive or seroindeterminate in a previous study (Table 5) [13]. Persons with seroindeterminate WB results showed reactivity to only a single Gag protein. In that earlier study, using the original PCR test, we detected SFV sequences in only three women, two infected with SFV originating from Colobus angolensis and one with SFV from Cercopithecus ascanius [13]. With the new PCR assay, we detected SFV sequences in an additional four people (three men, one woman), all of whom were infected with Colobus guereza SFV as determined by phylogenetic analysis (Table 5, Figure 2). Two of these four people had seropositive results, while the other two were seroindeterminate (Table 5). We also confirmed SFV infection and the simian origin of the SFV in the three women identified in the original study (Table 5, Figure 2). Five of these seven SFV-infected people reported various simian exposures, while the other two, one man and one woman, did not report specific primate exposures but frequently visited the forests surrounding their villages (Table 5). Nonetheless, we previously demonstrated that people with frequent forest visits were at increased risk for SFV infection, possibly from contact with urine or feces from infected animals [13]. The natural habitats of Colobus angolensis, Colobus guereza, and Cercopithecus ascanius include the forests of DRC, providing epidemiological support for our findings.
4. Discussion
We obtained and characterized three new SFV genomes, two from colobine monkeys naturally found in Asia (Trachypithecus francoisi and Pygathrix nemaeus) and one from an African colobine (Colobus guereza), by using a detailed sequence analysis of PBL tissue culture isolates. Previously, the only complete SFV genomes from Asian monkeys were from a single genus, Macaca, severely limiting conclusions about SFV evolution and biology and the design of diagnostic assays. Using the new genomes, we improved our diagnostic assays and confirmed infection in additional people from DRC with NHP and forest exposures. The new colobine SFV genomes also enabled a better understanding of the evolutionary histories of these ancient viruses, whose phylogenies did not support a strict SFV–host co-evolution model.
Overall, the three new SFV genomes were identical in structure to those of other SFVs and included relatively conserved structural, enzymatic, and auxiliary genes flanked by LTRs. Although the SFVtfr_083616 and SFVcgu_910916 genomes were intact, consisting of all five major coding regions, an in-frame Δtas mutation was present in the SFVpne_500057 genome that effectively fuses the tas open reading frame with that of bet without the need for alternative splicing. This mutation has been reported to cause increased Bet production, favoring viral latency by downregulation of viral replication [46]. We confirmed the presence of both this mutation and a wild-type SFV in Pne_500057 PBLs and in a PBL tissue culture isolate. While some have suggested the Δtas mutation may be an adaptation to tissue culture, we and others have reported this mutation in both NHP and human ex vivo blood specimens, suggesting that the Δtas mutation also likely contributes to SFV latency in vivo by upregulation of Bet in the absence of Tas [42,43,45,46]. Further, viral latency, combined with immunological suppression by neutralizing antibodies, interferon, and host restriction by apolipoprotein B mRNA-editing enzyme catalytic polypeptide-like (APOBEC) deaminases, likely explains the low viral loads in zoonotically SFV-infected individuals and the absence of person-to-person transmission [66,67,68,69].
The risk of acquiring pathogens from colobine monkeys in Asia and Africa is increased by hunting, by keeping the animals as pets, and by activities related to habitat loss or fragmentation caused by logging and agricultural expansion [13,39,70,71,72,73]. Occupational exposures to colobine monkeys can also occur at zoological gardens, where they are frequently kept in collections for their distinctive physical features and to promote primate conservation and education. Previously, we demonstrated that women in the DRC with NHP exposure were at increased risk for SFV infection, including two confirmed by molecular analyses to be infected with SFV originating from Colobus angolensis and one with SFV from Cercopithecus ascanius [13]. In the current study, we used a new generic SFV PCR assay developed with the new Colobinae genomes to re-test buffy coat DNA samples from 22 persons in DRC with primate exposure who were seroreactive for SFV, of whom 19 were originally PCR negative [13]. We demonstrated the new PCR assay to be significantly more sensitive than the original assay for detecting Colobinae SFV. Our application of the new assay identified SFV sequences in four more people with infection originating from Colobus guereza, expanding the known diversity of SFV variants infecting humans. We also amplified longer pol sequences from the three SFV-infected people reported in our previous study, confirming their infections with Colobus and Cercopithecus SFVs. The habitat of C. guereza includes the forests of DRC, where our study was located, which makes detection of this SFV strain plausible in the four people, all of whom frequented the forests surrounding their villages. Three of these four SFV-infected people reported direct NHP exposures, including butchering and eating NHPs. 
The fourth person did not report NHP exposure, but we have shown that simply entering forest environments places people at higher risk of SFV infection, likely from contact with body fluids of NHPs present in the forests [13]. Our inability to detect SFV sequences in the remaining 15 SFV seroreactive persons may indicate infection with other divergent SFVs not detected by our improved assay or low proviral loads, which are common with SFV infection [13,74].
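The re-test counts in this paragraph can be reconciled with simple arithmetic. The sketch below uses only numbers stated in the text (22 seroreactive persons, 19 originally PCR negative, four newly detected); the variable names are illustrative, and the overall detection fraction it prints is derived here rather than reported by the authors.

```python
# Counts stated in the text for the DRC re-test cohort.
seroreactive = 22
originally_pcr_negative = 19
newly_detected = 4  # Colobus guereza SFV found with the new assay

# Derived quantities (not reported as such in the text).
originally_pcr_positive = seroreactive - originally_pcr_negative
total_pcr_positive = originally_pcr_positive + newly_detected
still_undetected = seroreactive - total_pcr_positive

print(f"originally PCR positive: {originally_pcr_positive}")
print(f"PCR positive after re-test: {total_pcr_positive}/{seroreactive} "
      f"({total_pcr_positive / seroreactive:.1%})")
print(f"seroreactive but PCR negative: {still_undetected}")
```

The derived values match the three women identified in the earlier study, the seven SFV-infected people described in the Results, and the 15 seroreactive persons who remained PCR negative.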
Our detailed sequence analyses indicate SFV evolution has not strictly mirrored host phylogeny but rather has a more complex and dynamic history punctuated by host-switching events. Bayesian inference of gag-pol-env phylogenies showed that each new SFV was distinct from the others and from previously reported SFV genomes from monkeys and apes. Surprisingly, all three Colobinae SFV sequences demonstrated diverse evolutionary trajectories without forming a monophyletic clade, which would have been expected based on the prevailing SFV–host co-evolutionary dogma [12,16,40]. We confirmed these Colobinae SFV phylogenetic relationships by analysis of pol sequences from multiple colobine species with evidence of species-specific clades for Trachypithecus and African colobines. One exception was the placement of P. nemaeus SFV basal to Macaca SFV in the pol tree, compared with its position within the Macaca SFV clade in the gag-pol-env concatemer analysis, which was likely resolved by the inclusion of additional P. nemaeus SFV pol sequences. Our phylogenetic and TMRCA results do not mirror the evolutionary history of the colobine monkeys based on analysis of mitochondrial genomes, which proposes an origin in Africa 18–16 mya, with the Colobus/Procolobus divergence occurring in Africa around 9–7.5 mya [75]. These mitochondrial TMRCA estimates are consistent with the Colobinae fossil record [56]. The Asian colobine ancestor was then inferred to have diverged from the African colobines 12–10 mya, followed by populating Eurasia [75]. About 2 million years later in Asia, the colobine monkeys diverged into Semnopithecus, Trachypithecus/Presbytis, and the odd-nosed monkeys (Simias, Nasalis, Pygathrix, Rhinopithecus) [76]. 
In contrast to the colobine host evolutionary histories, the Trachypithecus SFV had the oldest estimated TMRCA (24–15 mya), suggesting that colobine SFVs may have originated in Asia and that African colobine SFVs represent a more recent divergence, potentially arising 14–8 mya from African Cercopithecinae.
Several factors may explain the unexpectedly deep divergences within colobine SFVs. One possibility is that the Trachypithecus SFVs reflect long-term virus–host co-divergence within Asian colobines, whereas the African colobine SFVs may have been acquired later through an ancient host switch from African cercopithecines. Alternatively, the deep divergence of the Trachypithecus SFV lineage may represent the retention of an older SFV lineage that arose early in cercopithecid evolution, with subsequent lineage loss or replacement in other Asian colobines. Under this scenario, the contemporary SFV distribution in colobines could reflect a combination of deep-time divergence, differential lineage survival, and limited cross-species transmission early in catarrhine dispersal between Africa and Eurasia, although the fossil record provides only indirect support for this broader geographic connectivity. This pattern is reminiscent of the phylogeny of SFVs from squirrel monkeys (Saimiri sciureus), which are basal to other platyrrhine SFVs rather than clustering with other Cebidae SFVs, indicating the persistence of deep viral lineages and/or ancient host switches [16,59,77]. Similarly, the basal position of Pygathrix SFVs relative to macaque SFVs suggests a shared ancestral SFV lineage or an early spillover event between colobines and macaques. The overlapping distributions of P. nemaeus and macaques in Southeast Asia raise the possibility that the SFV infection detected in Pne_500057 resulted from natural cross-species transmission rather than from transmission during captivity. If additional SFVs from other Asian colobines cluster with the Pygathrix SFV lineage, this would support the hypothesis that Old-World SFVs primarily group by host geography (Africa vs. Asia) rather than by strict host phylogeny, with the Trachypithecus SFVs representing an exception due to their unusually deep divergence. 
Together, these findings reveal a more complex and dynamic evolutionary history for SFVs in colobines than previously appreciated, including the potential for ancient host switches, deep viral lineage preservation, and hidden diversity in under-sampled Asian primates. This underscores the need to include diverse, under-sampled host species, such as colobines, in studies of SFV phylogenetics. Further sampling of Asian colobine SFVs, ideally from wild-caught individuals, would be essential to distinguish among these hypotheses and clarify the evolutionary history and host associations of SFVs in Old-World monkeys.
Our study has limitations. Interpretation of SFV phylogenies requires caution because divergence-time estimates for both viruses and hosts carry large credible intervals, and SFVs may experience lineage extinction, replacement, or unrecognized recombination [16,59,76,77,78,79]. Fossil evidence for early catarrhine dispersal across Africa and Eurasia is sparse and does not yet establish direct sympatry between African and Asian colobines [35,74,75]. Thus, alternative scenarios for the deep divergence of the Trachypithecus SFV lineage, including long-term co-divergence, ancient host switches, or retention of relict viral lineages, remain plausible but not definitively resolved. Broader geographic and taxonomic sampling of wild colobines, especially from underrepresented Asian lineages, will be crucial for distinguishing among these hypotheses and for refining the evolutionary history of SFVs in Old-World monkeys. Finally, the human samples tested were restricted to individuals from a specific region in the DRC, potentially missing SFV variants present in humans with different exposure histories. Despite improved PCR assays, SFV sequences were not detected in all seroreactive individuals, possibly due to low proviral loads or the presence of highly divergent SFV strains not targeted by the new assay. While our study identified new human infections, clinical and detailed contact information was not available to investigate clinical outcomes, transmission routes, or broader public health implications of SFV infection in humans.
In summary, we sequenced and analyzed three new SFV genomes from diverse Asian and African colobine monkeys, expanding the SFV genomic database. These new genomes enabled the development of improved diagnostic assays, which detected additional SFV infections in humans exposed to NHPs in the DRC. Genomic and phylogenetic analyses revealed that Colobinae SFVs have conserved genetic structures but follow complex evolutionary paths, including host switching, rather than strict co-evolution with their primate hosts. Our study highlights the need for broader SFV sampling to better understand viral evolution and zoonotic risks.
References
- 1. Calattini S., Nerrienet E., Mauclere P., Georges-Courbot M.C., Saib A., Gessain A. Natural simian foamy virus infection in wild-caught gorillas, mandrills and drills from Cameroon and Gabon. J. Gen. Virol. 2004, 85, 3313–3317. doi:10.1099/vir.0.80241-0. PMID: 15483245.
- 2. Calattini S., Wanert F., Thierry B., Schmitt C., Bassot S., Saib A., Herrenschmidt N., Gessain A. Modes of transmission and genetic diversity of foamy viruses in a Macaca tonkeana colony. Retrovirology 2006, 3, 23. doi:10.1186/1742-4690-3-23. PMID: 16608518. PMCID: PMC1533860.
- 3. Goldberg T.L., Sintasath D.M., Chapman C.A., Cameron K.M., Karesh W.B., Tang S., Wolfe N.D., Rwego I.B., Ting N., Switzer W.M. Coinfection of Ugandan red colobus (Procolobus [Piliocolobus] rufomitratus tephrosceles) with novel, divergent delta-, lenti-, and spumaretroviruses. J. Virol. 2009, 83, 11318–11329. doi:10.1128/JVI.02616-08. PMID: 19692478. PMCID: PMC2772775.
- 4. Jones-Engel L., Engel G.A., Heidrich J., Chalise M., Poudel N., Viscidi R., Barry P.A., Allan J.S., Grant R., Kyes R. Temple monkeys and health implications of commensalism, Kathmandu, Nepal. Emerg. Infect. Dis. 2006, 12, 900–906. doi:10.3201/eid1206.060030. PMID: 16707044. PMCID: PMC3373059.
- 5. Khan A.S., Bodem J., Buseyne F., Gessain A., Johnson W., Kuhn J.H., Kuzmak J., Lindemann D., Linial M.L., Lochelt M. Spumaretroviruses: Updated taxonomy and nomenclature. Virology 2018, 516, 158–164. doi:10.1016/j.virol.2017.12.035. PMID: 29407373. PMCID: PMC11318574.
- 6. Leendertz S.A., Junglen S., Hedemann C., Goffe A., Calvignac S., Boesch C., Leendertz F.H. High prevalence, coinfection rate, and genetic diversity of retroviruses in wild red colobus monkeys (Piliocolobus badius badius) in Tai National Park, Cote d’Ivoire. J. Virol. 2010, 84, 7427–7436. doi:10.1128/JVI.00697-10. PMID: 20484508. PMCID: PMC2897606.
- 7. Liu W., Worobey M., Li Y., Keele B.F., Bibollet-Ruche F., Guo Y., Goepfert P.A., Santiago M.L., Ndjango J.B., Neel C. Molecular ecology and natural history of simian foamy virus infection in wild-living chimpanzees. PLoS Pathog. 2008, 4, e1000097. doi:10.1371/journal.ppat.1000097. PMID: 18604273. PMCID: PMC2435277.
- 8. Morozov V.A., Leendertz F.H., Junglen S., Boesch C., Pauli G., Ellerbrok H. Frequent foamy virus infection in free-living chimpanzees of the Tai National Park (Cote d’Ivoire). J. Gen. Virol. 2009, 90, 500–506. doi:10.1099/vir.0.003939-0. PMID: 19141461.
