Coxsackievirus A24 variant whole genome sequencing from clinical samples using a three overlapping amplicons strategy
John Mwita Morobe, Samuel Odoyo, Arnold W. Lambisia, Edidah Moraa, Charlotte J. Houldcroft, Edward C. Holmes, George Githinji, Charles N. Agoti, Sanjay Tikute, John Morobe, Peter van Heusden

TL;DR
Researchers developed a sequencing method to study the coxsackievirus A24 variant causing an eye disease outbreak in Kenya.
Contribution
A novel whole-genome sequencing protocol for CA24v using overlapping amplicons was developed and applied to clinical samples.
Findings
Three near-complete CA24v genomes were recovered from clinical samples during the 2024 outbreak.
The amplicon-based sequencing approach is rapid, cost-effective, and suitable for resource-limited settings.
The protocol supports genomic surveillance and evolutionary studies of CA24v in Kenya and beyond.
Abstract
In January 2024, the Kenya Ministry of Health issued an outbreak alert following a surge in acute hemorrhagic conjunctivitis (AHC) cases along the Kenyan coast. AHC is characterized by the sudden onset of painful, swollen, and red eyes, often with subconjunctival haemorrhage and a foreign body sensation. Our investigations identified coxsackievirus A24 variant (CA24v) as the causative agent. In this study, we developed a whole genome sequencing assay for CA24v and used it to recover three near-complete genomes from the 2024 AHC outbreak in Kenya. This assay will support studies on CA24v genomic epidemiology and evolution across Kenya and beyond.
Table 1
| Amplicon name | Primer | Strand | Melting temperature (°C) | Position covering | Sequence (5'-3') | Product size (nt) | PCR cycle condition |
|---|---|---|---|---|---|---|---|
| Amplicon 1 | CVA24v_ | + | 63 | 88 | ATACCCCTTCCCCACGTAACTT | 2577 | |
| | CVA24v_ | - | 63 | 2644 | CCAGATGCACCGGTCTCTAC | | |
| Amplicon 2 | CVA24v_ | + | 58.4 | 2420 | TTTTAGTGTGCGTTTATTGAGAGACAC | 2700 | |
| | CVA24v_ | - | 58.7 | 5119 | GCCTCCATACAATTCCCAATG | | |
| Amplicon 3 | CVA24v_ | + | 59.5 | 2644 | GTATTGGCCTCAACAAACTCACA | 2643 | |
| | CVA24v_ | - | 62.2 | 7437 | CCCCTACAACAGTATAACCCAATCC | | |
Table 2
| Sequence name | Number of reads | Sequence length (nt) | Coverage (%) | GenBank accession | SRA accession |
|---|---|---|---|---|---|
| KEN/MSA/2/02-2024 | 8923 | 7304 | 98.13 | | SRX26951182 |
| KEN/MSA/4/02-2024 | 1151 | 5113 | 70.00 | | SRX26951184 |
| KEN/MSA/14/02-2024 | 16739 | 7304 | 98.13 | | SRX26951185 |
| NC | 29 | - | - | - | - |
- National Institute of Health and Care Research
- Wellcome Trust
- Rockefeller Foundation
Taxonomy
Topics: Viral Infections and Immunology Research · Infective Endocarditis Diagnosis and Management · Cytomegalovirus and herpesvirus research
Introduction
Coxsackievirus A24 variant (CA24v) is a member of the species Enterovirus coxsackiepol, genus Enterovirus, family Picornaviridae, and a leading cause of acute haemorrhagic conjunctivitis (AHC) outbreaks in the tropics; AHC is also referred to as "red eye" or "pink eye" disease ^1–4^. The CA24v genome is a single-stranded, positive-sense RNA molecule of approximately 7,400 nt that encodes four structural proteins (VP4, VP2, VP3, and VP1) and seven non-structural proteins (2A–2C and 3A–3D). To date, eight genotypes of CA24v (GI–GVIII) have been described, based on sequence homology within the VP1 gene ^5^.
As of 25 April 2025, fewer than 119 complete or near-complete genomes (>90% coverage) of CA24v, sampled between 1952 and 2024 in 25 countries, are publicly available in GenBank. This is a small number compared with the number of complete genomes available for other outbreak viruses, including influenza A (~165,000), monkeypox virus (~8,500), SARS-CoV-2 (~17,000,000) and Ebola virus (~3,400). Even within the genus Enterovirus, CA24v remains poorly represented: approximately 1,875 complete genomes are available for Enterovirus A71, 1,657 for Enterovirus D68, and 347 for Poliovirus type 1. The small number of publicly available CA24v genomes is explained in part by limited diagnostic capacity during outbreaks, the rarity of the infection, the self-limiting nature of AHC, and the absence of simple, cost-effective genome sequencing methods ^6,7^. This paucity of genomic data also limits our understanding of CA24v diversity, evolution and epidemiology ^7^.
Previous efforts to generate CA24v genomes have relied on metagenomic sequencing and primer walking approaches ^2,6^. However, these approaches are relatively expensive, technically demanding, and require significant hands-on time in the laboratory ^8^. An alternative is an overlapping amplicon sequencing strategy, in which the pathogen genome is amplified as a series of tiled fragments that are then sequenced and reassembled ^9^. This strategy has been used successfully on several viral pathogens, including enteroviruses such as Rhinovirus A15 and A105 ^10^, Enterovirus D68 ^11^, Echovirus 30 ^12^ and Coxsackievirus B5 ^13^. Herein, we present a tiled amplicon approach for CA24v sequencing, developed in response to the 2024 AHC outbreak in coastal Kenya ^14,15^, to enable high-throughput recovery of viral genomes directly from clinical samples and to support future real-time genomic surveillance.
Methods
We used the Primal Scheme algorithm with default parameters ^9^ to identify 12 primers (six pairs) binding at various positions within the CA24v genome. The input alignment comprised the CA24v genomes (>95% coverage) available in GenBank. Primer pairs were selected to produce amplicons of ~2,500 nucleotides. Following laboratory optimisation, six primers yielding three overlapping amplicons were selected (Table 1; Figure 1A). The amplicons overlapped by 222 nt between amplicons 1 and 2, and by 325 nt between amplicons 2 and 3 (Table 1; Figure 1A). This primer set was used to amplify viral RNA extracted from three CA24v-positive ocular samples identified in early February 2024 on the Kenyan coast ^14^, which had diagnostic cycle threshold (Ct) values of 32.29, 38.43, and 37.65 by qPCR.
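To make the tiling idea concrete, the sketch below lays fixed-length amplicons across a genome with a chosen overlap. It is only an illustration of the geometry, not the Primal Scheme algorithm, which selects primer binding sites from an alignment. The function names and the input values (a 7,400 nt genome, 2,700 nt amplicons, 300 nt overlap) are assumptions chosen for illustration and are not the coordinates in Table 1.

```python
from typing import List, Tuple


def tile_amplicons(genome_len: int, amplicon_len: int, overlap: int) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) coordinates of tiled amplicons.

    Hypothetical helper: each amplicon starts `amplicon_len - overlap`
    positions after the previous one, and the last one is clipped to the
    genome end.
    """
    if overlap >= amplicon_len:
        raise ValueError("overlap must be smaller than amplicon_len")
    step = amplicon_len - overlap
    tiles = []
    start = 1
    while True:
        end = min(start + amplicon_len - 1, genome_len)
        tiles.append((start, end))
        if end == genome_len:
            break
        start += step
    return tiles


def overlaps(tiles: List[Tuple[int, int]]) -> List[int]:
    """Number of positions shared by each pair of consecutive amplicons."""
    return [prev_end - next_start + 1
            for (_, prev_end), (next_start, _) in zip(tiles, tiles[1:])]


# Illustrative values only (assumed, not taken from Table 1).
example_tiles = tile_amplicons(genome_len=7400, amplicon_len=2700, overlap=300)
print(example_tiles)            # [(1, 2700), (2401, 5100), (4801, 7400)]
print(overlaps(example_tiles))  # [300, 300]
```

With these assumed values, three amplicons are enough to span a genome of about 7,400 nt, which is consistent with the three-amplicon design described above.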
Figure 1. Genome maps. (A) Schematic representation showing the position of overlapping amplicons in the CA24v genome (GenBank accession number PP548240). (B) Coverage plots for KEN/MSA/2/02-2024, KEN/MSA/4/02-2024 and KEN/MSA/14/02-2024.
Viral RNA was extracted from the three ocular samples using the QIAamp Viral RNA Mini Kit (Qiagen) and reverse transcribed using the LunaScript® RT SuperMix Kit (New England Biolabs, NEB). A negative control (NC; nuclease-free water) was included during both the extraction and reverse transcription steps. The cDNA was then amplified in three reaction tubes with the Q5® Hot Start High-Fidelity 2X Master Mix (NEB) using the newly designed and optimised CA24v primers and the thermocycling conditions shown in Table 1. The PCR products were run on a 1.5% agarose gel to confirm amplification before purification using Agencourt AMPure XP beads. Library preparation was performed using the Ligation Sequencing Kit (SQK-LSK114) and Native Barcoding Kit (NBD96), and sequencing was performed on the Oxford Nanopore Technologies (ONT) GridION platform.
Genome assembly was performed using a sub-workflow of an in-house pipeline named "ViralPhyl", available on GitHub (https://github.com/kwtrp-peo/viralphyl). Base-called reads were demultiplexed using the ARTIC Guppyplex tool with default parameters, applying a minimum Q score of 9. Reads shorter than 500 nt were filtered out using the ToulligQC module. Consensus sequences were generated by aligning the reads to a reference sequence (CVA24_2400060741_FRA24, GenBank accession PP548240) using minimap2 ^16^. Positions with coverage below 20 reads were masked with 'N'. The resulting consensus sequences were further refined using Medaka v2.0.1 to correct potential sequencing errors.
The recovered genome sequences were combined with publicly available CA24v genomes and aligned using MAFFT v7.5201 ^17^. A maximum likelihood (ML) phylogenetic tree was inferred using IQ-TREE v2.1.3 (http://www.iqtree.org/) under the GTR substitution model, with branch support assessed using 1000 bootstrap iterations. Nucleotide and amino acid variations between the newly sequenced genomes were analyzed using snipit v1.6 ^18^, with the input option --sequence-type nt for nucleotide variation and --sequence-type aa for amino acid variation.
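The depth-based masking step can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the ViralPhyl implementation: the function names are hypothetical, the per-position depths are assumed to be already available as a list (for example from a depth report), and the toy sequence and depths are made up.

```python
from typing import Sequence


def mask_low_coverage(consensus: str, depths: Sequence[int], min_depth: int = 20) -> str:
    """Replace every base whose read depth is below min_depth with 'N'."""
    if len(consensus) != len(depths):
        raise ValueError("consensus and depths must have the same length")
    return "".join(
        base if depth >= min_depth else "N"
        for base, depth in zip(consensus, depths)
    )


def percent_covered(masked: str, reference_length: int) -> float:
    """Percentage of reference positions carrying a called (non-N) base."""
    called = sum(1 for base in masked.upper() if base != "N")
    return 100.0 * called / reference_length


# Toy data (assumed): eight positions, three of them below the 20-read threshold.
toy_consensus = "ACGTACGT"
toy_depths = [25, 30, 19, 20, 5, 0, 40, 22]

masked = mask_low_coverage(toy_consensus, toy_depths)
print(masked)                                       # ACNTNNGT
print(percent_covered(masked, len(toy_consensus)))  # 62.5
```

A position with exactly 20 reads is kept, because the threshold described above masks only positions with coverage below 20.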
Results and discussion
Two of the recovered genome sequences (KEN/MSA/2/02-2024 and KEN/MSA/14/02-2024) were 7,304 nucleotides (nt) in length (Table 2), comprising a 5′ untranslated region (UTR) of 637 nt, a complete open reading frame (ORF) of 6,645 nt, and a 3′ UTR of 22 nt. Sequence KEN/MSA/4/02-2024 contained a 2,191 nt gap within the VP1, 2A and 2B regions of the ORF due to dropout of amplicon 2 (Table 2; Figure 1B), likely caused by low viral load, as indicated by the high Ct value of 38.43 in this sample. The new genomes were classified as genotype IV and clustered in a clade comprising sequences sampled in February 2024 in Mayotte, an overseas department and region of France (Figure 2A). The recovered genomes displayed nucleotide variations (n=25) across the entire genome (Figure 2B). However, only synonymous mutations were observed (i.e., no amino acid substitutions), indicating conservation at the protein level despite the nucleotide diversity.
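The lengths reported above can be cross-checked with simple arithmetic. The short sketch below uses only numbers stated in the text; the variable names are illustrative.

```python
# Region lengths reported for the two near-complete genomes (nt).
utr5_len, orf_len, utr3_len = 637, 6645, 22

genome_len = utr5_len + orf_len + utr3_len
print(genome_len)        # 7304

# A complete ORF should be a whole number of codons.
print(orf_len % 3 == 0)  # True
print(orf_len // 3)      # 2215

# KEN/MSA/4/02-2024: bases recovered after the 2,191 nt gap.
gap_len = 2191
recovered_len = genome_len - gap_len
print(recovered_len)     # 5113

# Fraction of the 7,304 nt sequence that was recovered, as a percentage.
print(round(100 * recovered_len / genome_len, 2))  # 70.0
```

The recovered length of 5,113 nt and the 70% figure agree with the values listed for KEN/MSA/4/02-2024 in Table 2.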
Figure 2. (A) Maximum likelihood phylogenetic tree based on the genome sequences of CA24v from this study (n=3) and previous outbreaks (n=119). (B) Nucleotide alignment showing the nucleotide variations across the three CA24v genomes, with KEN/MSA/2/02-2024 as the reference sequence.
This sequencing assay has two main limitations. First, the selected primers do not capture the entire 5′ and 3′ UTRs because they bind within the UTRs rather than at the terminal ends. Second, substantial genetic diversity exists within CA24v, yet our primers have only been tested with genotype IV, which is the genotype most commonly detected in recent studies. Future testing against a larger sample set and diverse CA24v genotypes is needed to confirm similar performance across genotypes.
In summary, we present a simple tiled-amplicon-based whole genome sequencing protocol for CA24v that has great potential to support future studies on the genomic epidemiology of CA24v.
Ethical approval
The samples analysed here were collected as part of the Ministry of Health response to the AHC outbreak; in such cases, written informed consent is not considered an essential step prior to specimen collection and individual consent is typically not required. Our analysis presents data that have been adequately anonymized, as approved by the Institutional Review Board (IRB), allowing us to publish the outcomes of the outbreak investigations. The processing and sequencing of these samples was approved by the IRB, including the waiver of individual consent given the public health emergency context. The molecular diagnostics and sequencing carried out in outbreak response scenarios by the Kenya Medical Research Institute (KEMRI) - Wellcome Trust Research Programme (KWTRP) were approved by the KEMRI Scientific Ethics Review Unit (SERU) Committee based in Nairobi, Kenya on May 19th, 2024 (Protocol #: KEMRI/SERU/CGMR-C/304/4894).
References
- 1. Shrestha R, Rijal RK, Kausar N: Acute Conjunctivitis among patients visiting the outpatient Department of Ophthalmology in a tertiary care centre. JNMA J Nepal Med Assoc. 2024;62(269):24–26. doi: 10.31729/jnma.8391. PMID: 38410017. PMCID: PMC10924492.
- 2. Tran H, Ha T, Hoang L: Coxsackievirus A24 causing acute conjunctivitis in a 2023 outbreak in Vietnam. Int J Infect Dis. 2024;146:107133. doi: 10.1016/j.ijid.2024.107133. PMID: 38876162. PMCID: PMC11847566.
- 3. Shrestha E, Katuwal N, Kharel Sitaula R: Investigation of the causative pathogen in the 2023 conjunctivitis outbreak of Nepal using unbiased metagenomic Next Generation Sequencing. medRxiv. 2024;2024.04.16.24305920. doi: 10.1101/2024.04.16.24305920.
- 4. Prajna NV, Prajna L, Teja V: Apollo rising: acute conjunctivitis outbreak in India, 2022. Cornea Open. 2023;2(2):e0009. doi: 10.1097/coa.0000000000000009. PMID: 37719281. PMCID: PMC10501505.
- 5. Kroneman A, Vennema H, Deforche K: An automated genotyping tool for enteroviruses and noroviruses. J Clin Virol. 2011;51(2):121–125. doi: 10.1016/j.jcv.2011.03.006. PMID: 21514213.
- 6. Haider SA, Jamal Z, Ammar M: Genomic insights into the 2023 outbreak of Acute Hemorrhagic Conjunctivitis in Pakistan: identification of Coxsackievirus A24 variant through next generation sequencing. medRxiv. 2023;2023.10.11.23296878. doi: 10.1101/2023.10.11.23296878.
- 7. Fonseca MC, Pupo-Meriño M, García-González LA: Molecular evolution of coxsackievirus A24v in Cuba over 23 years, 1986–2009. Sci Rep. 2020;10(1):13761. doi: 10.1038/s41598-020-70436-w. PMID: 32792520. PMCID: PMC7427094.
- 8. Yek C, Pacheco AR, Vanaerschot M: Metagenomic pathogen sequencing in resource-scarce settings: lessons learned and the road ahead. Front Epidemiol. 2022;2:926695. doi: 10.3389/fepid.2022.926695. PMID: 36247976. PMCID: PMC9558322.
