Molecular Epidemiology of SARS-CoV-2 in Northern Greece from the Index Case up to Early 2025 Using Nanopore Sequencing
Georgios Meletis, Styliani Pappa, Georgia Gioula, Maria Exindari, Maria Christoforidi, Anna Papa

TL;DR
This study tracks the genetic evolution of SARS-CoV-2 in northern Greece from early 2020 to 2025, highlighting the spread of key variants and the importance of genomic surveillance.
Contribution
The study provides a five-year genomic analysis of SARS-CoV-2 in northern Greece, including detailed lineage tracking and insights into recombination events.
Findings
Early introductions included B.1, B.1.177, and B.1.258 lineages.
Omicron variants BA.4/BA.5 and JN.1 became dominant in 2022 and 2024, respectively.
Recombinants XDK, XDD, and XEC were identified and tracked using PANGO nomenclature.
Abstract
Background/Objectives: Since its emergence in late 2019, SARS-CoV-2 has demonstrated remarkable genetic diversity driven by mutations and recombination events that shaped the course of the COVID-19 pandemic. Continuous genomic monitoring is essential to track viral evolution, assess the spread of variants of concern (VOCs), and inform public health strategies. The present study aimed to characterize the molecular epidemiology of SARS-CoV-2 in northern Greece from the first national case in February 2020 through early 2025. Methods: A total of 66 respiratory samples collected from hospitalized patients across Northern Greece were subjected to whole-genome sequencing using Oxford Nanopore Technologies’ MinION Mk1C platform and the ARTIC protocol. Sequences were analyzed with PANGO, Nextclade, and GISAID nomenclature systems for lineage and clade assignment, and the WHO nomenclature for…
- European Union’s Horizon 2020 research and innovation programme
- National Public Investment Program of the Ministry of Development and Investment/General Secretariat for Research and Technology
1. Introduction
Severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spread globally after emerging in China in 2019 and caused the coronavirus disease 2019 (COVID-19) pandemic, with millions of deaths and enormous social and economic implications [1]. Since then, COVID-19 has presented with variable symptomatology and severity (ranging from asymptomatic infection to acute respiratory distress syndrome), depending on the circulating variants and the interaction of the virus with human hosts [2].
SARS-CoV-2 is a single-stranded RNA virus of the genus Betacoronavirus in the family Coronaviridae. Over time, it has undergone continuous point mutations that resulted in the emergence of different lineages and variants [3]. Thus, during the pandemic and afterwards, numerous variants with different mutation combinations and variable clinical characteristics have emerged [4], creating the need for monitoring, classification and nomenclature systems.
The Global Initiative on Sharing All Influenza Data (GISAID) was established in 2008, in response to concerns over H5N1 avian influenza, and its initial aim was to provide open access to genomic data of influenza viruses [5]. Following the identification of COVID-19 as a newly emerging viral respiratory disease, GISAID established the EpiCoV™ platform to ensure open access to data and to overcome hurdles and restrictions that discouraged or prevented prompt dissemination of virological data prior to official publication. GISAID now maintains the world’s largest repository of SARS-CoV-2 sequences, including related clinical, epidemiological and geographical data.
The Nextstrain pathogen surveillance platform (Nextstrain.org) offers real-time views of evolving pathogen populations through interactive visualizations, enabling users to explore datasets and analyses that are continuously updated as new genomic data become available [6]. In early 2020, Nextstrain introduced informal clade labels for SARS-CoV-2 to facilitate URL links that provided an “automatic zoom” to specific regions of the phylogenetic tree. These labels, composed of ad hoc letter–number combinations, were not intended to serve as a permanent naming system.
For SARS-CoV-2, the Phylogenetic Assignment of Named Global Outbreak Lineages (PANGO) dynamic nomenclature [7] is widely recognized as the standard system for classifying and naming genetically distinct lineages, based on analyses of complete or near-complete viral genomes. Within this framework, new lineages are assessed by a committee according to criteria such as novel evolutionary traits, transmissibility, pathogenicity, significant increases in frequency, and evidence of spread across regions [8]. Although the PANGO naming scheme is scientifically rigorous, it was quickly acknowledged that it could cause confusion, particularly in non-technical discussions or when conveying information to the general public.
The World Health Organization (WHO) established a classification system for SARS-CoV-2 variants based on their actual or anticipated clinical and epidemiological impact on public health (https://www.who.int/activities/tracking-SARS-CoV-2-variants accessed on 3 September 2025). Variants were grouped into categories such as variants under monitoring (VUM), variants of interest (VOI), defined by features requiring ongoing surveillance and further assessment, and variants of concern (VOC), which demonstrated increased transmissibility, virulence, or pathogenicity, and in some cases showed partial resistance to treatment or immune protection from vaccination or prior infection [9,10,11]. To facilitate communication with healthcare providers and the general public, WHO also introduced a simplified nomenclature using Greek alphabet labels, applied only to major SARS-CoV-2 lineages that significantly influenced the course of the pandemic. The most clinically important variants designated with Greek letters were Alpha, Beta, Gamma, Delta, and Omicron. Since March 2023, WHO has emphasized the VUM, VOI and VOC categories, and Greek letters have been reserved for newly designated VOCs. Later Omicron sublineages (e.g., EG.5, JN.1, KP.3, KS.1) are therefore reported by their PANGO names with their WHO category at the time of circulation and are not assigned new Greek letters.
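The relationship between PANGO names and WHO Greek-letter labels can be illustrated with a small lookup table. This is a minimal sketch: the table is restricted to lineages named in this article, the function name `who_label` is hypothetical, and it is not an official WHO or PANGO resource. An explicit table is used instead of prefix matching because PANGO aliases such as EG.5 or JN.1 do not share a textual prefix with their parent lineage.

```python
from typing import Optional

# Illustrative table limited to lineages mentioned in this article.
# Lineages without an entry (e.g., B.1, B.1.177, B.1.258) carry no Greek label here.
WHO_LABELS = {
    "B.1.1.7": "Alpha",
    "B.1.351": "Beta",
    "B.1.617.2": "Delta",
    "AY.43": "Delta",
    "BA.1": "Omicron",
    "BA.1.1": "Omicron",
    "BA.2": "Omicron",
    "BA.4": "Omicron",
    "BA.5": "Omicron",
    "BF.5": "Omicron",
    "EG.5": "Omicron",
    "JN.1": "Omicron",
    "KS.1": "Omicron",
    "KP.3": "Omicron",
}


def who_label(pango_lineage: str) -> Optional[str]:
    """Return the WHO Greek-letter label for a PANGO lineage, or None if not in the table."""
    return WHO_LABELS.get(pango_lineage.strip())


for lineage in ("B.1", "B.1.1.7", "AY.43", "JN.1"):
    print(lineage, "->", who_label(lineage))
```

A production workflow would instead resolve aliases through the PANGO designation data, which this sketch does not attempt.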
The emergence of Omicron and its subsequent variants was associated with overall lower virulence and triggered the end of the pandemic status. However, reference laboratories continued the molecular characterization of SARS-CoV-2 in order to monitor the evolution of the virus. Moreover, in March 2024, WHO launched a coronavirus network (CoViNet) to enhance coordination and support for the early, accurate detection, monitoring, and assessment of SARS-CoV-2, MERS-CoV, and emerging coronaviruses of public health significance (https://www.who.int/news/item/27-03-2024-who-launches-covinet--a-global-network-for-coronaviruses accessed on 3 September 2025).
The aim of the present study was to apply a next-generation sequencing (NGS) workflow to obtain whole-genome sequences (WGS) from SARS-CoV-2-positive samples and to identify the variants circulating in northern Greece in various time periods, from the first COVID-19 case in Greece up to early 2025.
2. Materials and Methods
2.1. Sample Collection and Preparation
The study was carried out at the Microbiology Laboratory, School of Medicine, Aristotle University of Thessaloniki, Greece. Nasopharyngeal and oropharyngeal swabs from patients infected with SARS-CoV-2 and hospitalized in various hospitals across northern Greece were sent to the National Influenza Centre for Northern Greece. Viral RNA was extracted using the Nucleic Acid Extraction Kit (Magnetic Bead Method) by Zybio Inc. (Chongqing, China) according to the manufacturer’s protocol, and a reverse transcriptase-polymerase chain reaction (RT-PCR) was applied for the detection of SARS-CoV-2 (TaqPath COVID-19 CE-IVD RT-PCR Kit, Thermo Fisher Scientific, Waltham, MA, USA). A total of 66 respiratory samples with Ct values of 20–29 in diagnostic RT-PCR were selected for NGS following a convenience sampling strategy. All positive SARS-CoV-2 samples were stored at −80 °C until further use.
2.2. Library Preparation/Sequencing Using MinION Mk1C and GridION
An amplicon-based NGS protocol was applied to amplify overlapping segments of the viral genome of ~400 bp in length each, to obtain the complete genome sequence of SARS-CoV-2 (~30,000 bp). The sequencing was performed on the MinION platform of Oxford Nanopore Technologies (ONT, Oxford, UK). Various bioinformatic programs were applied, and the obtained sequences were compared with corresponding sequences from the GISAID database.
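As a rough illustration of what tiling a ~30,000 bp genome with ~400 bp overlapping amplicons implies, the sketch below estimates the number of amplicons required. The 100 bp overlap is a hypothetical value chosen only for illustration; the actual ARTIC primer schemes define their own amplicon coordinates, which are not reproduced here, and the function name is likewise an assumption.

```python
import math


def estimate_amplicon_count(genome_len: int, amplicon_len: int, overlap: int) -> int:
    """Estimate how many fixed-length amplicons are needed to tile a genome end to end."""
    if not 0 <= overlap < amplicon_len:
        raise ValueError("overlap must be non-negative and smaller than the amplicon length")
    if genome_len <= amplicon_len:
        return 1
    step = amplicon_len - overlap  # new bases contributed by each additional amplicon
    return math.ceil((genome_len - amplicon_len) / step) + 1


GENOME_LEN = 30_000     # approximate genome size given in the text
AMPLICON_LEN = 400      # approximate amplicon size given in the text
ASSUMED_OVERLAP = 100   # hypothetical overlap, not taken from the paper

print(estimate_amplicon_count(GENOME_LEN, AMPLICON_LEN, ASSUMED_OVERLAP))
```

Under these assumptions the estimate is on the order of one hundred amplicons, which shows why a failed amplicon leaves a visible gap in coverage.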
In the present study we applied the PCR tiling of SARS-CoV-2 virus-classic (SQK-LSK109 with EXP-NBD104-114) sequencing protocol (Oxford Nanopore Technologies, ONT, Oxford, UK) using the V3 and V4.1 primers developed by the ARTIC Network (https://artic.network accessed on 3 September 2025) and third-party reagents from New England Biolabs (NEB, Ipswich, MA, USA). The consensus sequence was determined using reference-based alignment. All sequencing runs contained a negative control sample and were performed using the ligation sequencing kit 109 with Native Barcoding Expansion 1-12 (EXP-NBD104) and 13-24 (EXP-NBD114) (SQK-LSK109, EXP-NBD104-114, ONT, Oxford, UK). Before cDNA synthesis, samples with Ct values of 12–15 were diluted 1:100 with nuclease-free water, samples with Ct values of 15–18 were diluted 1:10, and samples with Ct values of 18–32 were used undiluted. cDNA synthesis was performed using 16 μL of viral RNA mixed with 4 μL of LunaScript RT SuperMix (M3010, NEB, Ipswich, MA, USA), following the manufacturer’s guidelines. Following cDNA synthesis, the overlapping amplicons (400 bp) were generated using primer pool V3 or V4.1 (ARTIC nCoV-2019 V3 Panel and ARTIC nCoV-2019 V4.1 Panel, IDT, Coralville, IA, USA). A total of 15 ng of the prepared library was loaded in a total volume of 75 μL onto a primed R9.4.1 or R10.4 flow cell (ONT, Oxford, UK) installed in a MinION Mk1C or GridION (ONT, Oxford, UK) device and run for 18 h following the manufacturer’s protocol.
Basecalling and demultiplexing were conducted using the MinKNOW software 24.06.5, which is embedded within the MinION Mk1C and GridION devices (ONT, Oxford, UK), and barcodes were trimmed. A first quick analysis was performed with the EPI2ME Agent cloud-based platform. Subsequent data analysis was also performed on the EPI2ME platform, developed by Metrichor Ltd. (Oxford, UK), incorporating the ARTIC, Nextclade and Pangolin tools. The ARTIC pipeline assessed the depth of coverage for each barcoded sample and facilitated the examination of specific amplicons that may not have been successfully amplified by both primer pools. Nextclade identified genetic variants by comparison with the reference genome and provided quality control metrics. Pangolin was used to determine the lineage of each sample. The reference genome utilized for SARS-CoV-2 analysis was derived from the Wuhan strain (GenBank Accession number MN908947).
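The Ct-dependent dilution rule described above can be written as a small helper. This is a sketch only: the function name is hypothetical, and because the stated ranges share their boundary values (15 and 18), the code assumes that a boundary value falls into the higher-Ct bracket, which the text does not specify.

```python
from typing import Optional


def dilution_factor(ct: float) -> Optional[int]:
    """Return the fold dilution applied before cDNA synthesis.

    100 means 1:100, 10 means 1:10, 1 means undiluted.
    Returns None for Ct values outside the ranges given in the text.
    Assumption: boundary values (15, 18) belong to the higher-Ct bracket.
    """
    if 12 <= ct < 15:
        return 100
    if 15 <= ct < 18:
        return 10
    if 18 <= ct <= 32:
        return 1
    return None


for ct in (13.5, 16.0, 24.2, 35.0):
    print(ct, dilution_factor(ct))
```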
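Once lineages have been assigned, a per-lineage tally is a natural next step. The sketch below assumes that the assignments are available as a comma-separated report with `taxon` and `lineage` columns, as in the report written by the Pangolin command-line tool; whether the EPI2ME workflow used here exposes the same file is an assumption, and the barcode names are invented for illustration.

```python
import csv
import io
from collections import Counter


def count_lineages(report_text: str) -> Counter:
    """Count how many samples were assigned to each lineage in a CSV report."""
    reader = csv.DictReader(io.StringIO(report_text))
    # Skip rows where the lineage field is missing or empty.
    return Counter(row["lineage"] for row in reader if row.get("lineage"))


# Hypothetical report content; sample names are placeholders, not study data.
example_report = """taxon,lineage
barcode01,B.1
barcode02,JN.1
barcode03,JN.1
"""

for lineage, n in count_lineages(example_report).most_common():
    print(lineage, n)
```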
3. Results
Whole-genome sequences (WGSs) were obtained from sixty-six SARS-CoV-2-positive samples collected from February 2020 onwards, including the first COVID-19 case in Greece (Sample 141). Genome coverage greater than 90% and a sequencing depth of more than 30× were used as criteria for successful sequencing. All samples were classified according to PANGO, WHO (where applicable), GISAID and Nextstrain (Table 1).
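The two acceptance criteria can be expressed as a simple filter. In this sketch the field and function names are hypothetical, the example values are invented, and depth is interpreted as mean depth across the genome, which the text does not state explicitly.

```python
from dataclasses import dataclass

MIN_COVERAGE_PCT = 90.0  # genome coverage must be greater than 90%
MIN_DEPTH = 30.0         # depth must be greater than 30x (assumed to mean average depth)


@dataclass
class SampleQC:
    sample_id: str
    coverage_pct: float
    mean_depth: float


def passes_qc(sample: SampleQC) -> bool:
    """Apply both thresholds; a sample must exceed each of them."""
    return sample.coverage_pct > MIN_COVERAGE_PCT and sample.mean_depth > MIN_DEPTH


# Invented values for illustration only.
samples = [
    SampleQC("example_A", 98.7, 450.0),
    SampleQC("example_B", 88.0, 120.0),
    SampleQC("example_C", 95.1, 22.0),
]
print([s.sample_id for s in samples if passes_qc(s)])
```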
Figure 1 displays a map of Greece with the number of samples included in the study per year and in total.
Thirty-four different lineages were identified according to the PANGO lineage system (Figure 2), highlighting the continuous evolution of the virus in northern Greece during the five years of the study. The VOCs according to the WHO nomenclature were Alpha and Beta between February and June 2021 (5/66 each); samples before February 2021 were not assigned as VOCs. Delta (AY.43) appeared briefly in January–February 2022 (2/66; 22.2% of genomes in that period) but was rapidly displaced by Omicron, which accounted for 77.8% (7/9) of genomes in January–February 2022 and reached 100% by May 2022. Omicron subsequently diversified into BA.1/BA.1.1 (3/66), BA.2 (6/66), BA.4/BA.5 (14/66), BF.5 (1/66), EG.5 (1/66), JN.1 (4/66), KS.1 (2/66), KP.3 (5/66), and recombinants XDK, XDD, and XEC (5/66). A year-by-year summary of lineage counts is provided in Supplementary Table S1.
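The period-specific percentages quoted for January–February 2022 follow directly from the counts given in the text (2 Delta and 7 Omicron genomes out of 9). A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def percentage_shares(counts: dict) -> dict:
    """Convert raw genome counts into percentage shares rounded to one decimal place."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no genomes in this period")
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


# Counts for January-February 2022 as reported in the text.
jan_feb_2022 = {"Delta": 2, "Omicron": 7}
print(percentage_shares(jan_feb_2022))  # {'Delta': 22.2, 'Omicron': 77.8}
```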
4. Discussion
The present study provides a detailed molecular epidemiological analysis of SARS-CoV-2 circulation in various time periods in northern Greece, spanning from the first recorded COVID-19 case in February 2020 through early 2025. By employing Nanopore sequencing with the MinION Mk1C and the ARTIC protocol, we were able to track the evolution and succession of viral lineages across a five-year period in our geographic area. Our results demonstrate the dynamic nature of SARS-CoV-2, from early variants, such as Alpha and Beta, to the global dominance of Omicron and its sub-lineages, as well as the emergence of recombinant forms. These findings are consistent with the global epidemiological observations.
The first confirmed case of COVID-19 in Greece was identified in Thessaloniki on 26 February 2020 and was detected in our laboratory (Specimen ID 141). The patient had returned from Italy, where the COVID-19 outbreak had already begun [12]. Soon afterwards, the first pandemic wave started in Greece and lasted until 3 May 2020 [13]. According to the present study, this first sequence belonged to the B.1 lineage, in line with other virus introductions into Europe from Italy at that time [14]. It is classified within GISAID clade G, defined by the D614G spike mutation, which results from an A-to-G nucleotide substitution at position 23,403 in the Wuhan reference strain [15]. This mutation rapidly became dominant in the first half of 2020 due to enhanced transmissibility [16]. The detection of B.1.177 and B.1.258 lineages in late 2020 further reflects the European epidemiological pattern, as these variants were widespread across the continent during the second wave [17,18]. Accordingly, the Nextstrain clade 20E (EU1 cluster), which derived from 20A by an additional mutation (spike A222V), consisted of lineage B.1.177 and its sub-lineages [7], underscoring the importance of cross-border virus circulation. Interestingly, B.1.1.318, which was notably prevalent in Greece in early 2021 [19], is absent from our dataset, probably due to our hospital-based sampling.
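The link between genomic position 23,403 and spike residue 614 can be checked with a short calculation. The sketch below relies on two details that are not stated in the text and are taken as assumptions from the GenBank annotation of MN908947: the spike coding sequence begins at genomic position 21,563, and the reference codon for residue 614 is GAT. The function name is hypothetical.

```python
from typing import Tuple

SPIKE_START = 21_563  # 1-based start of the spike gene in MN908947 (assumption, from GenBank annotation)


def genome_to_codon(position: int, gene_start: int) -> Tuple[int, int]:
    """Map a 1-based genomic position to (codon number, position within codon), both 1-based."""
    offset = position - gene_start
    if offset < 0:
        raise ValueError("position lies upstream of the gene start")
    return offset // 3 + 1, offset % 3 + 1


codon_number, codon_position = genome_to_codon(23_403, SPIKE_START)
print(codon_number, codon_position)  # 614 2

# Only the two codons needed for this example, from the standard genetic code.
CODON_TABLE_SUBSET = {"GAT": "D", "GGT": "G"}

ref_codon = "GAT"  # assumed reference codon for spike residue 614
alt_codon = ref_codon[: codon_position - 1] + "G" + ref_codon[codon_position:]
print(f"{CODON_TABLE_SUBSET[ref_codon]}{codon_number}{CODON_TABLE_SUBSET[alt_codon]}")  # D614G
```

The A-to-G change falls on the second base of the codon, which is why it alters the encoded amino acid from aspartate to glycine.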
Our study demonstrates that by early 2021, the Alpha (B.1.1.7) and Beta (B.1.351) variants circulated in northern Greece. The Alpha variant, first identified in the United Kingdom, was associated with increased transmissibility and higher viral loads [20,21], and soon became predominant across Europe [22]. In parallel, the Beta variant, first described in South Africa, was of particular concern due to immune escape mutations (K417N, E484K, N501Y) in the spike protein [23]. The simultaneous detection of both variants locally suggests multiple introductions and heterogeneous transmission chains, as has been reported in other European countries [24].
The global spread of the Delta variant (B.1.617.2 and sub-lineages) marked a turning point in the pandemic, associated with higher transmissibility and increased risk of severe disease compared to Alpha [22,25]. In our dataset, Delta lineages such as AY.43 were identified in 2022. Moreover, our results highlight the rapid replacement of Delta by Omicron that occurred between late 2021 and early 2022 in Greece [26] (BA.1 detected in January 2022 and BA.2 in February 2022). This pattern mirrors the global trend, as Omicron rapidly achieved dominance due to its substantial immune evasion capabilities [27]. While Omicron was associated with reduced severity compared to Delta [28], its high transmissibility and the sheer number of infections led to significant morbidity and mortality, especially among unvaccinated and high-risk individuals [29].
Our study captured the progressive diversification of Omicron into BA.4.1, BA.5, and subsequent sub-lineages, including BF.5, EG.5, JN.1, KS.1, and KP.3. These findings underscore the virus’s ongoing adaptability in the face of widespread vaccination and natural immunity. Recent data suggest that Omicron sub-lineages exhibit convergent evolution at key antigenic sites, such as receptor-binding domain (RBD) mutations facilitating immune escape [30]. The detection of recombinant lineages in our dataset is also consistent with global reports of recombination as a significant evolutionary mechanism in SARS-CoV-2 [31]. Recombination can confer selective advantages by combining immune-evasive and transmissible features from different parental strains, warranting continuous genomic surveillance. In line with WHO’s current framework, these are described by their PANGO names and were classified as VOI or VUM during 2023–2025, without assignment of new Greek letters. Epidemiologically, EG.5 was designated a WHO VOI on 9 August 2023 [32], JN.1 became globally dominant and informed 2024–2025 vaccine updates [33], and KP.3/KP.3.1.1 were prominent “FLiRT” lineages during 2024, while KS.1/KS.1.1 remained less prevalent PANGO lineages [34].
The use of Oxford Nanopore Technologies (ONT) sequencing allowed rapid and cost-effective sequencing, even under resource-limited conditions. Nanopore sequencing is increasingly adopted for real-time outbreak monitoring due to its portability and ability to generate complete viral genomes within hours [35,36]. While Illumina sequencing remains the gold standard for accuracy, ONT has proven valuable for genomic surveillance, particularly when rapid turnaround is essential [37]. In our study, ONT enabled continuous monitoring across a five-year period, supporting its feasibility for surveillance in specialized or even decentralized laboratories. Furthermore, integration with open-source tools facilitated lineage assignment and contextualization within global phylogenies.
Some limitations of the study must be acknowledged. First, the sample size was relatively small for a five-year period (n = 66), which limits the representativeness of the findings; larger-scale sequencing efforts would better capture the full diversity of circulating lineages. Second, sampling was biased mostly towards hospitalized patients, which may not fully reflect community-level transmission dynamics. Third, given the fast pace of viral evolution, some transient lineages may have gone undetected due to limited temporal and geographical sampling. Finally, while ONT sequencing provides rapid data, it is prone to higher error rates compared with Illumina platforms [38].
Despite these limitations, the present study demonstrates the value of continuous genomic surveillance in understanding the evolutionary trajectory of SARS-CoV-2. Our findings reinforce the necessity of sustaining sequencing capacity beyond the acute pandemic phase. Genomic monitoring is particularly critical for detecting immune-escape variants, recombinants and lineages with altered clinical or epidemiological characteristics. The recent establishment of WHO’s CoViNet provides an opportunity to integrate regional surveillance efforts into a global framework, enabling rapid data sharing and coordinated response strategies.
The COVID-19 pandemic prompted the application of new sequencing technologies and bioinformatic programs that can also be used for other diseases, and molecular epidemiology carries significant implications for public health policy. The dynamic replacement of lineages and the repeated introduction of variants into northern Greece illustrate that viral evolution cannot be considered independent from human mobility, population immunity, and intervention strategies. Regional outbreaks were often linked to international travel and local social dynamics, suggesting that timely genomic data, when integrated with epidemiological investigations, can provide actionable insights for containment. Strengthening the interface between genomic surveillance and public health decision-making could shorten response times to emerging threats.
In addition, the gradual transition from the pandemic emergency phase to endemic circulation of SARS-CoV-2 underscores the need to reshape surveillance frameworks. Rather than maintaining virus-specific monitoring systems, integrated platforms that simultaneously track SARS-CoV-2, influenza, respiratory syncytial virus (RSV), and other respiratory viruses provide a more sustainable and cost-effective model [39]. Currently, SARS-CoV-2 genomic surveillance in Greece is integrated with that of other respiratory viruses, such as influenza and RSV, to establish comprehensive respiratory virus surveillance networks.
The role of recombination in SARS-CoV-2 evolution warrants particular attention, as recombinants may pose unpredictable risks. In addition, the combination of genomic data with serological studies and vaccination records can provide insights into the interplay between immunity and viral evolution.
Numerous studies, especially during the COVID-19 pandemic, have shown the value of portable and rapid sequencing technologies. When the results are combined with open-access databases, these technologies can support resilient surveillance networks capable of detecting convergent evolution, recombination, and immune-escape mutations in real time. The lessons learned from SARS-CoV-2 emphasize that preparedness requires not only technological readiness, but also sustained investment, international collaboration, and transparent data sharing. By embedding genomic surveillance within broader public health infrastructure, countries can move from reactive crisis management toward proactive prevention of future pandemics, whether caused by novel coronaviruses or other emerging pathogens.
5. Conclusions
This study provides a comprehensive molecular epidemiological overview of SARS-CoV-2 circulation in northern Greece over a five-year period starting from the index Greek case. By employing Nanopore sequencing with the ARTIC protocol, we were able to document the successive waves of viral variants, from early B.1 lineages to the global dominance of Omicron and its sub-lineages, as well as the emergence of recombinant forms. The evolutionary course of SARS-CoV-2 is shaped by many factors including international travel, regional connectivity, local transmission dynamics, as well as the influence of public health measures and vaccination campaigns. The simultaneous presence of multiple lineages reflects heterogeneous transmission chains and underlines the importance of continued genomic surveillance. In conclusion, sustained genomic monitoring, integrated with epidemiological data, remains essential to detect emerging variants, guide public health responses and strengthen pandemic preparedness.
- 1. Worldometer. Coronavirus. Available online: https://www.worldometers.info/coronavirus (accessed on 2 September 2025).
- 2. Guan W.J., Ni Z.Y., Hu Y., Liang W.H., Ou C.Q., He J.X., Liu L., Shan H., Lei C.L., Hui D.S.C. Clinical characteristics of Coronavirus Disease 2019 in China. N. Engl. J. Med. 2020, 382, 1708–1720. doi:10.1056/NEJMoa2002032. PMID: 32109013. PMCID: PMC7092819.
- 3. Lippi G., Mattiuzzi C., Henry B.M. Updated picture of SARS-CoV-2 variants and mutations. Diagnosis 2021, 9, 11–17. doi:10.1515/dx-2021-0149. PMID: 34958184.
- 4. Gong W., Parkkila S., Wu X., Aspatwar A. SARS-CoV-2 variants and COVID-19 vaccines: Current challenges and future strategies. Int. Rev. Immunol. 2023, 42, 393–414. doi:10.1080/08830185.2022.2079642. PMID: 35635216.
- 5. Nakagawa S., Miyazawa T. Genome evolution of SARS-CoV-2 and its virological characteristics. Inflamm. Regen. 2020, 40, 17. doi:10.1186/s41232-020-00126-7. PMID: 32834891. PMCID: PMC7415347.
- 6. Hadfield J., Megill C., Bell S.M., Huddleston J., Potter B., Callender C., Sagulenko P., Bedford T., Neher R.A. Nextstrain: Real-time tracking of pathogen evolution. Bioinformatics 2018, 34, 4121–4123. doi:10.1093/bioinformatics/bty407. PMID: 29790939. PMCID: PMC6247931.
- 7. Rambaut A., Holmes E.C., O’Toole Á., Hill V., McCrone J.T., Ruis C., du Plessis L., Pybus O.G. A dynamic nomenclature proposal for SARS-CoV-2 lineages to assist genomic epidemiology. Nat. Microbiol. 2020, 5, 1403–1407. doi:10.1038/s41564-020-0770-5. PMID: 32669681. PMCID: PMC7610519.
- 8. O’Toole Á., Scher E., Underwood A., Jackson B., Hill V., McCrone J.T., Colquhoun R., Ruis C., Abu-Dahab K., Taylor B. Assignment of epidemiological lineages in an emerging pandemic using the pangolin tool. Virus Evol. 2021, 7, veab064. doi:10.1093/ve/veab064. PMID: 34527285. PMCID: PMC8344591.
