Diversity of Cardinium Endosymbiont Genomes from Plant-Parasitic Nematodes
Sergey V. Tarlachkov, Alexander Y. Ryss, Yury Y. Ilinsky, Dmitry A. Rodionov, Lydmila I. Evtushenko, Sergei A. Subbotin

TL;DR
This study explores the diversity of Cardinium bacteria in plant-parasitic nematodes, revealing their evolutionary relationships and metabolic adaptations.
Contribution
The paper provides ten new Cardinium genomes and clarifies the evolutionary origin of groups B and F within nematode-associated Cardinium.
Findings
Cardinium genomes from nematodes show broad ecological and phylogenetic distribution.
Two Bursaphelenchus-associated Cardinium genomes exhibit significant genome reduction.
Horizontal gene transfer has contributed to key metabolic pathways in Cardinium.
Abstract
Cardinium endosymbionts are obligate intracellular bacteria found in a wide range of invertebrate hosts. In this study, we generated ten new Cardinium genomes from plant-parasitic nematodes of the genera Amplimerlinius, Bursaphelenchus, Cactodera, Ditylenchus, Globodera, Meloidoderita, and Rotylenchus, revealing their broad ecological and phylogenetic distribution. Using an expanded set of genes, we clarified the relationship between previously defined Cardinium groups B and F from nematodes, showing that they are closely related and likely share a single evolutionary origin within nematode-associated Cardinium. Among the newly assembled Cardinium genomes obtained in this study, two genomes originating from strains associated with wood-inhabiting Bursaphelenchus species exhibited remarkable genome reduction, with estimated sizes of approximately 695 kb. Functional annotation of…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
Figure 6
Figure 7
Figure 8
Figure 9
Figure 10
Figure 11
Figure 12
Figure 13
Figure 14
Figure 15
Figure 16- —Ministry of Science and Higher Education of the Russian Federation
- —State Assignment
- —USDA APHIS FarmBill
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsNematode management and characterization studies · Entomopathogenic Microorganisms in Pest Control · Insect symbiosis and bacterial influences
1. Introduction
Bacteria belonging to the genus Candidatus Cardinium (Bacteroidota) and related organisms composing the Cardinium clade are intracellular endosymbionts that are widely spread among a diverse range of invertebrate hosts. These include arachnids, insects, copepods, ostracods, and mussels, as well as certain plant-parasitic nematodes [1,2,3,4,5,6,7,8]. Cardinium is best known for its ability to manipulate host reproduction through a variety of mechanisms: cytoplasmic incompatibility, parthenogenesis induction, and feminization [8]. Endosymbionts can strongly influence an insect host’s nutrition, immune function, development, and reproductive processes [9]. It has been hypothesized that this bacterium might stimulate pheromone production (aggregation/sex) via the terpenoid pathway [10]. Cardinium shows promise for applications in the control of arthropod pest species and arthropod-vectored disease transmission; however, much remains unknown about its physiology and interactions with host organisms. Although Cardinium-associated phenotypes have been investigated in various arthropod hosts, very little is known about the phenotypic effects of Cardinium infection in nematodes [8].
Phylogenetic analyses based on 16S rRNA gene sequences have delineated Cardinium into seven groups [4,7]. This grouping is further supported by analyses of additional loci, including gyrB, sufB, groEL, fusA, and infB [7,11]. Group A is currently the most well characterized and broadly distributed, encompassing strains that infect a wide array of arthropod hosts, including both mandibulate taxa (e.g., insects, crustaceans) and chelicerates (e.g., mites, spiders) [4]. Group B is composed of Cardinium strains associated with plant-parasitic nematodes, such as Heterodera species, Globodera rostochiensis, Cactodera rosae, Punctodera chalcoensis, and Rotylenchus zhongshanensis [4,7,12,13]. Group C appears to be host-restricted, consisting exclusively of Cardinium strains isolated from Culicoides biting midges [14]. Subsequently, representatives of group D were reported in a single marine copepod species, Nitocra spinipes, and presumed to occur in other copepods [15]. Group E contains a Cardinium strain from a single oribatid mite species, Achipteria coleoptrata [16], whereas group F is composed of that from a plant-parasitic root lesion nematode, P. penetrans, only [17]. Group G includes Cardinium strains from non-marine ostracod species and a freshwater mussel, Unio crassus [6,7,18]. Recently, intracellular Cardinium-like symbionts were detected in the wood-inhabiting nematode Bursaphelenchus mucronatus through transmission electron microscopy [19]. These endosymbionts were observed within the nematode’s tissues, suggesting a potential intimate association between the microorganism and its host. However, the molecular characterization and phylogenetic identification of this Cardinium-like bacterium have not yet been performed, leaving its taxonomic position and functional role within B. mucronatus unresolved.
To date, 27 Cardinium genomes are publicly available, representing a growing but still limited genomic dataset for this diverse group of endosymbionts. Of these, seven genomes have been completed, and the remaining genomes are draft assemblies of varying quality and remain incomplete [8]. Two genomes belong to Cardinium group B, one to group F, and the rest to group A. The Cardinium genomes are reduced in size (0.9–1.5 Mbp), with a 33.5–39.0% GC content and a gene count of ~1000 [8,20,21]. For comparison, the closest known relative of Cardinium, the species Candidatus Amoebophilus asiaticus, which is also an obligate symbiont, has a 1.9 Mbp genome, ~1500 genes, and a 35% GC content [22].
The primary aim of this study was to expand the genomic information on Cardinium from plant-parasitic nematodes. To achieve this, we screened both newly generated and publicly available raw genomic sequencing data, successfully recovering ten previously unreported Cardinium genomes. We conducted phylogenetic analyses to determine their evolutionary relationships, assembled genomes of ten Cardinium strains, and performed annotation to characterize their genomic features. Notably, this study reports the first detection of Cardinium in two additional nematode hosts, namely, a stem nematode of the genus Ditylenchus and a sedentary nematode of the genus Meloidoderita, and also molecularly confirms the presence of Cardinium in wood-inhabiting nematodes of the genus Bursaphelenchus. These findings broaden the known host range of Cardinium and contribute to a more complete understanding of the diversity and evolutionary history of this bacterium within nematode symbioses.
2. Results
2.1. New Findings of Cardinium in Plant-Parasitic Nematodes and Their Phylogenetic Relationships
Screening of both original and publicly available Sequence Read Archive (SRA) datasets for the presence of Cardinium 16S rRNA and gyrB gene fragments revealed the occurrence of this endosymbiotic bacterium in genomic assemblies of several nematode taxa. Specifically, Cardinium sequences were detected in datasets corresponding to Amplimerlinius sp., Bursaphelenchus fraudulentus, B. mucronatus, Ditylenchus weischeri, Globodera ellingtonae, and Rotylenchus zhongshanensis (Table A1). In addition, the presence of Cardinium was confirmed in the SRA datasets of Cactodera rosae, Cactodera sp., G. rostochiensis, Heterodera goettingiana, H. latipons, and H. sturhani, as identified in our previous study [6].
The phylogenetic relationships of Cardinium strains based on the 16S rRNA gene sequences obtained from plant-parasitic nematodes and other arthropod hosts are shown in Figure 1. Overall, the relationships among the major Cardinium groups were not well resolved based on the 16S rRNA gene, due to the conserved nature of this marker and limited phylogenetic signal. Intraspecific sequence variation for group B was up to 6.5%, group F—2.7%, and group A—3.1%.
Phylogenetic trees inferred from gyrB gene fragment sequences and from a concatenated dataset comprising eight protein-coding genes (atpD, dnaB, fusA, groEL, gyrB, infB, rpsA, and sufB) (Table A2) are presented in Figure A1. These multilocus phylogenetic analyses provided improved phylogenetic resolution within the Cardinium clade. Notably, Cardinium strains associated with nematodes formed a well-supported monophyletic lineage (groups B and F) in all analyses, except for the tree reconstructed from the nucleotide alignment of the eight-gene concatenation, where the nematode-associated sequences appeared paraphyletic. In single-gene phylogenetic analyses, nucleotide sequences of dnaB, groEL, gyrB, and rpsA recovered nematode-associated Cardinium as a monophyletic lineage, whereas analyses based on atpD, fusA, infB, and sufB indicated paraphyletic relationships. Maximum likelihood comparisons of alternative tree topologies using the Kishino–Hasegawa (KH), Shimodaira–Hasegawa (SH), and Shimodaira approximately unbiased (AU) tests did not reject the hypothesis (p < 0.05) of nematode-associated Cardinium monophyly for the atpD, fusA, and sufB datasets. Only the infB alignment significantly rejected this hypothesis under both SH (p = 0.034) and AU (p = 0.017) tests. Phylogenetic analysis based on amino acid sequences of the polA gene also demonstrated the monophyly of Cardinium strains from plant-parasitic nematodes (Figure A2). The sequence of Cardinium from biting midges occupied a basal position on the trees in all the analyses. The phylogenomic tree inferred from 20 bacterial genomes of the Cardinium clade using JolyTree is given in Figure A3.
The Cardinium 16S rRNA gene sequences and the D2–D3 expansion fragments of the nematode 28S rRNA gene were used to perform a cophylogenetic analysis aimed at exploring potential patterns of host–symbiont association. Although the 16S rRNA gene provided limited phylogenetic resolution among Cardinium strains, the Procrustean Approach to Cophylogeny (PACo) analysis revealed a significant global cophylogenetic signal, indicating that host and symbiont phylogenies are more congruent than expected by chance. The sum of squared residuals (SS) was 0.78, and the permutation test with 1000 replicates produced a p-value of 0, supporting the hypothesis of cophylogeny. However, there were several clear topological incongruences between some bacterial and host lineages (Figure A4). For instance, Cardinium sequences associated with wood-inhabiting nematodes of the genus Bursaphelenchus clustered together with bacterial strains from sedentary plant-parasitic nematodes, despite the fact that these hosts belong to different nematode superfamilies or orders and are only distantly related. Similarly, two morphologically and genetically similar species of Cactodera harbored phylogenetically distinct and unrelated Cardinium strains (Figure A4).
2.2. The Whole and Draft Genomes of Cardinium and Their Annotation
In this study, we successfully assembled ten Cardinium genomes, including two complete (single circular contig without gaps) and eight draft assemblies (Table S1). The summary statistics for these genomes are presented in Table 1. The assemblies ranged from 1 to 144 scaffolds, with total sizes varying between 695,026 and 1,345,476 base pairs. The Cardinium endosymbionts of Bursaphylenchus fraudulentus and B. mucronatus have chromosomes of 695,026 bp and 695,094 bp (Figure 2) in size, respectively, which were assembled in a single circular contig (Table 1 and Table S1). Sequencing depth (genome coverage) for these ten genomes spanned from 19× to 1313×. The GC content of the ten genomes ranged from 35.23% to 41.07%. A total of 600 to 1025 protein-coding genes were predicted per genome, reflecting notable variability in gene content across different strains. There was no evidence of plasmids in the genomes. CheckM analysis showed low levels of contamination (less than 5%) and moderate levels of completeness (approximately 70%). It is worth noting that other Cardinium genomes represented in Genbank have a similar level of completeness. For example, the complete genome of Cardinium hertigii cHgTN10 has a completeness metric of 70% with a contamination level of 1%.
In the Cardinium symbionts of B. mucronatus and B. fraudulentus, a pronounced enrichment in guanine over cytosine and thymine over adenine is observed across one half of the genome, with the opposite pattern in the other half (Figure 2A, inner track). This asymmetric nucleotide composition, known as GC skew, typically changes sign at genomic positions corresponding to the origin and terminus of DNA replication (Figure 2A, green arrows). In contrast, the Cardinium symbiont of Heterodera glycines also exhibits GC skew, although it is less pronounced (Figure 2B, innermost track), possibly reflecting subsequent genomic rearrangements that have altered replication-associated asymmetry. Furthermore, in both genomes, the putative origins of replication are located at positions distant from the dnaA gene, which is set at position zero on the circular genome map. It is also noteworthy that in both Cardinium genomes, the ribosomal RNA genes are not organized into a single operon but instead occur in two separate regions, corresponding to the 16S and 23S–5S rRNA loci (Figure 2, track 3, highlighted in red). As shown in Figure A5, the genomes of Cardinium associated with B. mucronatus and H. glycines exhibit extensive genomic rearrangements, indicating substantial structural divergence between these strains.
The orthologous group (orthogroup) analysis of twenty Cardinium genomes is presented in Figure 3. These genomes share a total of 427 orthogroups (Table S2). The Cardinium genomes from nematodes contain between 21 (cBmuc) and 442 (cDwei) unique strain-specific genes (singletons), of which only 0–9% have an assigned function. All genomes from groups A, B, and C share only one orthogroup encoding a hypothetical protein. In addition, genomes from groups A, C, and F share a siderophore (surfactin) biosynthesis regulatory protein that is absent in the other groups. Genomes from groups B and F each possess three genes encoding unique proteins: a 57-amino acid hypothetical protein, ATP-dependent DNA helicase RecG, and N-acetylmuramoyl-L-alanine amidase. Although group B has no unique orthologues overall, the anhydromuropeptide permease (AmpG), a transporter protein located in the inner membrane of certain Gram-negative bacteria, becomes a unique orthologue for this group once the cMwhi genome is excluded.
Among Cardinium genomes from nematodes of different genera, shared orthologues vary substantially. Strains from Heterodera share 53 orthologues, including 51 hypothetical proteins, a low-specificity L-threonine aldolase, and a glycosyl transferase family 1 protein. Strains from Globodera share 142 orthologues, comprising 140 hypothetical proteins, phenylalanyl-tRNA synthetase β chain, and 1-acyl-sn-glycerol-3-phosphate acyltransferase. In contrast, Bursaphelenchus-associated Cardinium strains share only 144 hypothetical proteins (Figure 3).
The values of average nucleotide identity (ANI) and digital DNA-DNA hybridization (dDDH) based on a genome-to-genome sequence comparison calculated for Cardinium strains from plant-parasitic nematodes are shown in Table A3. Only three strain pairs, namely, cBmuc and cBfra, cGell and cGros, and cHhum and cHstu, are clearly above 95–96% (ANI) and 70% (DDH), used as boundaries for prokaryote species delineation, and, thus, these values indicate co-specificity of strains from these pairs.
Functional annotations of metabolic pathways in twenty Cardinium genomes, including ten new ones and four outgroup bacterial taxa, are given in Figure 4 and Table S3. Although the majority of Cardinium genomes lack biosynthetic pathways for all amino acids and nucleotides and most B vitamins, functional annotation indicates the presence of biosynthetic pathways for two B vitamins, pyridoxine (B6) and biotin (B7), in a small subset of genomes, while the lipoate coenzyme biosynthesis is conserved in all Cardinium genomes from groups A, B, and C and in one out of three group-F genomes. The unique distribution of the B7 biosynthetic pathway is described below, while the B6 biosynthesis genes were identified in two group-A Cardinium genomes. While the biosynthetic pathways for fatty acids and lipopolysaccharides are conserved in all analyzed genomes, the glycerophospholipid pathway is incomplete in eleven Cardinium genomes, with the absence of the gpsA gene encoding glycerol-3-phosphate dehydrogenase, a first pathway step converting glycerate-3-phosphate to glycerol-3-phosphate. We observed a significant reduction in the central carbohydrate metabolism pathways, namely, glycolysis/gluconeogenesis and the tricarboxylic acid (TCA) cycle, and conservation of the pyruvate dehydrogenase (PDH) enzymatic complex in all analyzed Cardinium except two group-F genomes. Finally, all Cardinium genomes lack catabolic or degradation pathways for carbohydrates and amino acids, confirming their metabolic dependence on carbon sources from the host.
The biotin biosynthesis pathway is present in two genomes (cCsp3 and cDwei) and absent in other Cardinium strains from nematodes (Figure 4 and Figure 5). Within Cardinium from nematodes, a complete biotin pathway is found only in cDwei, and this pathway is only partially encoded in cCsp3. The bioY gene coding a membrane protein, which is part of the BioMNY ABC transporter system for biotin, is found in several Cardinium strains (cHstu, cHgTN10, cGros, cGell, cMwhi, and cAsp). In contrast, the biotin protein ligase BirA is present in all genomes, since it is an essential enzyme for the use of this cofactor by biotin-dependent enzymes from the fatty acid biosynthesis pathway (Figure 5). Genes involved in the biotin pathway are not arranged in one operon for cDwei (Figure A6). Phylogenetic analysis of the concatenation of bioA, bioB, bioD, and bioF genes for Cardinium and other bacteria indicated that these genes may undergo independent horizontal transfer events at least two times from distinct bacterial donors to Cardinium symbionts of plant-parasitic nematodes (Figure A7). The phylogeny of the bioB gene also confirms the independent origin of this gene for groups A and F and suggests an ancient origin of this gene for nematodes and its horizontal transfer from a rickettsia-like bacterium (Figure 6).
The glycerophospholipid biosynthesis pathway is also incomplete in groups B and F, where the gpsA gene encoding the first pathway step is absent in most strains, except for cPpe and cHgTN10 (Figure 7). Phylogenetic analysis of gpsA amino acid sequences indicates that this gene was acquired independently in cPpe and cHgTN10 through horizontal gene transfer (HGT), rather than through close evolutionary relationships with the gpsA genes of groups A and C (Figure A8). The majority of core genes involved in glycolysis are also missing in Cardinium strains from nematodes. Phylogenetic analysis of the ppdK gene encoding the gluconeogenic enzyme pyruvate phosphate dikinase suggests that this gene was independently acquired by groups A, B, and F through horizontal transfer from at least three distinct bacterial donors (Figure A9). The aceE, aceF, and lpdA genes encoding the pyruvate dehydrogenase complex (PDH) form the only set of the central carbohydrate metabolism genes that is conserved in all Cardinium genomes from groups A, B, and C and is present in cDwei from group F. The malic MaeB enzyme converting malate to pyruvate is universally conserved in all Cardinium genomes.
Genes associated with DNA replication were found in all Cardinium genomes, except for recN, encoding the DNA repair protein RecN, which is present only in group C. The polA gene, which encodes DNA polymerase I and is a multifunctional enzyme essential for DNA replication and repair in prokaryotes, shows partial loss in several Cardinium strains. Specifically, deletions affecting fragments encoding the 5′→3′ polymerase domain (C-terminal region) and the 3′→5′ exonuclease domain (proofreading, central region) were detected in multiple strains from groups A and B and in one strain from group F (Figure A2). These gene reductions appear to have occurred independently during Cardinium evolution.
3. Discussion
Our findings highlight the remarkable diversity of Cardinium strains associated with plant-parasitic nematodes, expanding the known host range beyond previously reported species [7,12,13,17]. In particular, we report novel Cardinium associations in representatives of the nematode genera Amplimerlinius, Bursaphelenchus, Ditylenchus, and Meloidoderita, underscoring the broad phylogenetic and ecological distribution of this symbiont within plant-parasitic taxa. Plant-parasitic nematodes from these genera exhibit diverse feeding strategies and ecological behaviors, ranging from ecto- and endo-root parasites to wood inhabitants transmitted by insect vectors. A relatively larger number of Cardinium was found in the cyst nematodes (Cactodera, Heterodera, Globodera, and Punctodera), which are sedentary endoparasites that form long-lasting cysts in soil. Their infective juveniles invade plant roots and induce specialized feeding sites called syncytia. These genera are highly specialized and often target specific plants. Other Cardinium strains were presently reported from only one species of other nematode genera. Meloidoderita species are characterized by swollen, cystoid females with a functional stylet, vermiform males often lacking a stylet, and second-stage juveniles that induce syncytia in host roots. In the present classification, Meloidoderita and cyst nematodes belong to different orders or superfamilies. Stem nematodes of the genus Ditylenchus induce characteristic swellings, distortions, and hypertrophy in above-ground plant tissues, particularly stems, leaves, buds, and floral structures. These symptoms result from the nematodes’ feeding within the parenchyma, which disrupts normal cell development and triggers abnormal growth responses. Ditylenchus weischeri, in which Cardinium was found in the present study, is a highly specialized parasite associated with Canada thistle, Cirsium arvense. Amplimerlinius species are ectoparasitic nematodes that feed externally on root surfaces. While they are generally less destructive than endoparasites, they can impair root function, particularly under stress conditions or in high populations. Nematodes of the genus Bursaphelenchus represent a distinct ecological niche among plant-parasitic nematodes. Several species induce wilt diseases in various wood trees and palms. Some representatives of the genus combine plant parasitism with mycophagy (fungus-feeding) and rely on insect vectors. This dual feeding strategy and vector association make Bursaphelenchus ecologically unique in forest ecosystems [23].
Previous phylogenetic analyses based on limited Cardinium gene sequences identified the presence of two major lineages, or groups B and F, associated with plant-parasitic nematodes; however, the evolutionary relationship between these groups remained unresolved [7]. By incorporating a broader set of genes into the analysis, a clearer and more robust phylogenetic framework has emerged. This expanded dataset reveals that the two lineages are likely related and share a common ancestor, suggesting a single evolutionary origin followed by diversification within nematode-associated Cardinium. These findings refine our understanding of Cardinium evolution and suggest a potentially conserved symbiotic co-evolution within nematode hosts.
Symbionts belonging to the Cardinium clade are known to be vertically transmitted through maternal inheritance. However, the presence of closely related Cardinium strains in diverse and distantly related host species suggests that vertical transmission alone does not explain their distribution. This pattern has led to the conclusion that horizontal transmission events also play a role in the spread of these endosymbionts across different host lineages [11,24]. While horizontal transmission of Cardinium among arthropod hosts has been well documented in previous studies, this study represents the first comprehensive analysis of such patterns within plant-parasitic nematode groups. Cophylogenetic analyses of Cardinium strains and their plant-parasitic nematode hosts reveal that host and symbiont lineages have undergone correlated evolution, reflecting some degree of co-diversification. However, patterns of incongruence between host and symbiont phylogenies indicate evidence for several host switching events. These results expand our understanding of Cardinium’s evolutionary dynamics and suggest that cross-species transmission may play a significant role in shaping symbiont–host associations beyond arthropods.
The most widespread group within the Cardinium clade is group A, likely originating in soil mites, where it circulated extensively. From herbivorous mites, the symbiont spread to phloem-feeding insects with piercing–sucking mouthparts, enabling transmission to plants via salivary secretions. Experimental evidence confirms that Cardinium of insects can undergo interspecific horizontal transmission through plant tissue [25,26]. Subsequent evolutionary transitions led to its presence in endoparasitic wasps and predators such as spiders, rove beetles, and opilionids, likely via trophic interactions with infected hosts [7]. While there is no direct evidence that Cardinium is transmitted via plant tissue during nematode feeding, the hypothesis is biologically plausible and should be carefully experimentally tested. Discovering Cardinium in a wide range of plant-parasitic nematodes with diverse feeding strategies and ecological behaviors suggested that plants may serve as ecological reservoirs for Cardinium bacteria, facilitating their persistence in the environment and enabling horizontal transmission to new nematode hosts. This plant-mediated route may not only support the survival of Cardinium outside its primary host but also enhance its potential to colonize novel nematode hosts, contributing to the symbiont’s broad and scattered phylogenetic distribution.
Among the newly assembled Cardinium genomes obtained from plant nematodes in this study, two genomes originating from strains associated with wood-inhabiting Bursaphelenchus species exhibited remarkable genome reduction, with estimated sizes of approximately 695 kb. However, such a pronounced genome size reduction is not associated with extensive orthologous gene loss in these strains. This represents, to our knowledge, the smallest Cardinium genomes described to date. Interestingly, despite their extreme reduction, these genomes are comparable in size to those of certain genera and species of highly specialized endosymbiotic bacteria found in arthropods and nematodes [27,28].
Despite extensive genome reduction, Cardinium genomes retain a conserved core of DNA replication genes, including dnaA, dnaB, dnaG, and others, indicating a functional but streamlined replication apparatus. In contrast, genes involved in DNA repair, recombination, and replication checkpoint control are often incomplete or absent, reflecting increasing reliance on host cellular processes [29]. In several Cardinium strains, the polA gene encoding DNA polymerase I, a multifunctional enzyme that plays a key role in DNA replication and repair in prokaryotes, showed partial loss of domains. Loss of the proofreading exonuclease domain in family A DNA polymerase is well documented, whereas cases in which the polymerase domain itself is absent while exonuclease activity is retained are not well known [30,31].
Functional annotation of Cardinium genomes indicates the absence of or a reduction in several pathways, including the biotin biosynthetic pathway. Biotin, a vitamin belonging to the B-complex group, serves as a vital coenzyme that participates in fatty acid synthesis and other metabolic pathways in all bacteria. A complete biotin biosynthetic pathway, comprising bioA, bioD, bioC, bioH, bioF, and bioB, was identified in Cardinium cEper1 and Cardinium cSfur, endosymbionts that occur, respectively, in wasp Encarsia pergandiella [29] and planthopper Sogatella furcifera [24]. An incomplete biotin biosynthetic pathway, with losses of bioB and nearly all of bioF, was reported in the genome of Cardinium cBtQ1 from silverleaf whitefly, Bemisia tabaci [32]. In this study, a complete biotin synthesis pathway was found in the Cardinium strain from D. weischeri; only a partially encoded pathway was found in the Cardinium strain from Cactodera sp., and it was completely absent in all other newly sequenced Cardinium. The universal conservation of the biotin ligase gene birA, as well as of the fatty acid biosynthesis pathway, confirms that the biotin cofactor is universally indispensable in all Cardinium species and that most of them acquire this vitamin from the host.
The loss or absence of the biotin biosynthetic pathway in the majority of Cardinium strains suggests that endogenous synthesis of this vitamin is neither essential for the bacterium’s own metabolic requirements nor a critical factor in most Cardinium–host symbiotic relationships [8]. However, it is plausible that the biotin biosynthesis capabilities confer a nutritional advantage to their arthropod and nematode hosts. B vitamins, including biotin, are essential micronutrients that play critical roles in metabolic processes and overall host fitness. In such nutritional contexts, the ability of endosymbiotic bacteria to synthesize and supplement B vitamins may be crucial for host survival and reproductive success. Indeed, all Cardinium genomes lack biosynthetic pathways for the majority of B vitamins (with the only two exceptions of vitamins B6 and B7, as discussed above), suggesting the Cardinium species are multi-auxotrophs for nearly all indispensable B vitamins, thus, being totally dependent on their salvage from the host (and if it is a non-plant host, dependent on the host dietary supply).
Although biotin plays essential roles in metabolism, the capacity for de novo synthesis is limited to bacteria, plants, and some fungi, via a conserved four-step pathway beginning with pimeloyl-CoA and ending with biotin [33,34,35,36]. Animals, by contrast, lack this biosynthetic ability and rely on dietary intake [37,38]. Interestingly, biotin-related metabolism is dynamically regulated in cyst and root-knot nematodes and plant-parasitic nematodes, which have reacquired a bacterial-like biotin synthase gene through HGT [38]. While nematodes do not possess the full biosynthetic pathway, they retain the final step—conversion of dethiobiotin to biotin [39]—suggesting it may scavenge this metabolic precursor from the host to fulfill its biotin requirements and highlighting the nutrient’s functional importance in the nematode [38]. It has been well established that several genes involved in biotin biosynthesis, namely, bioA, bioB, and bioD, were horizontally transferred from bacteria into the genome of the whitefly (Bemisia tabaci) [40]. These horizontally acquired genes are believed to play an important role in supplementing the insect’s nutritional requirements, particularly given its nutrient-limited phloem-feeding lifestyle. Our analysis also confirms the presence of a horizontally transferred bioB gene in nematodes [38,41]. This finding suggests that nematodes, like whiteflies, may have independently acquired bacterial biotin-related genes, potentially enhancing their metabolic capabilities or ecological adaptability.
Phylogenetic analysis of the biotin gene cluster in endosymbiotic bacteria does not support previous hypotheses suggesting that these operons were acquired by Wolbachia through horizontal gene transfer from the genus Cardinium [42,43,44]. Instead, the data indicate a possible transfer from Wolbachia and related bacteria to Cardinium, as proposed by Zeng et al. [24]. The lack of phylogenetic congruence suggests that the acquisition of the biotin operon in Cardinium did not occur through a single ancestral event. Rather, it points to at least three independent horizontal transfer events throughout Cardinium’s evolutionary history: (i) endosymbionts of planthoppers, the whitefly Encarsia parasitoid wasp, and the silverleaf whitefly (group A); (ii) an endosymbiont of a stem nematode (group F); and (iii) an endosymbiont of a cyst nematode (group B). These instances of horizontal gene transfer likely originated from phylogenetically distinct bacterial donors.
It has been known that the highly conserved pathways of central metabolism are reduced or entirely absent in Cardinium. Moreover, our genomic searches did not find any catabolic pathway for carbohydrates or amino acids in Cardinium genomes. This indicates that Cardinium depends heavily on host-derived metabolites and pathway intermediates to compensate for its incomplete biosynthetic capabilities [8]. Analysis shows that most Cardinium genomes retain only a partial glycolytic pathway and, in the nematode-associated lineages, it is completely absent, suggesting an even higher degree of host dependence. Gluconeogenesis is similarly incomplete across Cardinium, and most strains lack the capacity to synthesize six-carbon sugars [8]. Glycerophospholipid metabolism appears to be incomplete in groups B and F, where the gpsA gene is missing. The GpsA enzyme normally catalyzes the reduction of glycerone-P (dihydroxyacetone phosphate, DHAP) to glycerol-P, a key precursor for glycerophospholipid biosynthesis. The absence of gpsA therefore disrupts the canonical entry point into the pathway. Interestingly, this pattern mirrors the presence of incomplete glycolysis observed in groups A and C. In these organisms, glycolysis does not proceed through the full set of enzymatic steps, yet glycerone-P remains one of the central intermediates that is still produced. The shared dependence on glycerone-P suggests a functional link between the truncated glycolytic pathway and the glycerophospholipid biosynthesis pathway.
There is strong phylogenetic and comparative genomic evidence that several glycolytic and carbohydrate metabolism enzyme genes have undergone horizontal gene transfer among bacteria. In Cardinium from the planthopper S. furcifera, for example, gpml, enolase, and ppdK were previously identified as HGT-derived genes acquired from distantly related proteobacterial donors [24]. These findings highlight the role of horizontal gene transfer in supplementing or restoring key metabolic functions in intracellular bacteria with reduced genomes. Our analysis further supports this pattern and reveals that ppdK was also horizontally acquired in Cardinium strains associated with nematodes. Notably, the phylogenetic placement of nematode-associated ppdK sequences indicates that these transfer events occurred independently from those reported in insect-associated Cardinium.
4. Materials and Methods
4.1. Nematode Populations, DNA Extraction, and Whole-Genome Amplification
Six nematode DNA samples containing Cardinium detected from our previous project [7] and several new nematode DNA samples were obtained and used in this study (Table A1). DNA from each sample was extracted from several adult and juvenile specimens using the proteinase K protocol. Nematodes were cut under a stereomicroscope. Cut nematodes in 16 μL water suspension were transferred into a 0.2 mL Eppendorf tube, and 3 μL proteinase K (600 μg mL^−1^) (Promega, Madison, WI, USA) with 2 μL 10 × PCR buffer (Taq PCR Core Kit, Qiagen, Germantown, MD, USA) was added to each tube. The tubes were incubated at 65 °C (1 h) and 95 °C (15 min) consecutively. Then, 2 or 3 μL from each sample in two or three replicates was submitted for whole-genome amplification (WGA). WGA was performed using an Illustra GenomiPhi V2 DNA amplification kit (Cytiva, Marlborough, MA, USA) following the manufacturer’s instructions. The replicates belonging to the same sample were mixed and purified using a QIAquick PCR Purification Kit (Qiagen, Germantown, MD, USA). A total of 30 μL containing 60 ng or more of DNA per μL was submitted for genome sequencing.
4.2. Genome Sequencing and Genome Assembly
DNA library construction and sequencing of samples was conducted by Novogene Co., Ltd. (Sacramento, CA, USA), and Dante Labs Inc. (New York, NY, USA). At Novogene, a NEBNext Ultra II DNA library prep kit from Illumina (New England Biolabs, Inc., Ipswich, MA, USA) was used following the manufacturer’s recommendations. Pooled DNA libraries were sequenced on an Illumina NovaSeq 6000 instrument to obtain 150 bp paired-end reads for three samples. At Dante Labs, an MGI library prep kit was used following the manufacturer’s recommendations. Pooled DNA libraries were sequenced on an MGI T7 instrument to obtain 150 bp paired-end reads for the samples. The quality of raw reads was evaluated using FastQC [45]. All genomes except Bursaphelenchus fraudulentus were assembled using SqueezeMeta v1.7.0 [46]. Assembly graphs were then manually checked with Bandage v0.8.1 [47] to remove short, low-coverage sequences not connected to the main graph. The Cardinium of the B. fraudulentus genome was assembled using the Cardinium of the B. mucronatus genome as a reference. In short, raw reads were cleaned using Trimmomatic v0.39 [48] and mapped to the reference genome using Bowtie2 v2.5.4 [49], followed by genome polishing using Polypolish v0.6.0 [50]. The same clean reads were then mapped to the genomic sequence obtained in the previous step using BWA v0.7.19 [51], followed by final polishing using Pilon v1.24 [52]. Genome completeness and contamination metrics were estimated by CheckM v1.2.4 [53]. Orthogroups between the genomes were found using OrthoFinder 3.0.1b1 [54]. The genome relatedness indices, viz., the average nucleotide identity (ANI) and digital DNA-DNA hybridization (dDDH) values, were calculated using JSpecies 1.2.1 [55] and GGDC 3.0 [56] tools, respectively. Genomes were visualized using DNAPlotter v18.2.0 [57]. A circular diagram of orthology between genomes was generated using the mummer2circos program [58].
The 16S rRNA, atpD, dnaB, fusA, groEL, gyrB, infB, rspA, and sufB Cardinium genes were assembled using corresponding reference genes of Cardinium of H. humuli (JAOPFT000000000.1) and H. glycines (CP029619) with Geneious 9.5 [59]. The D2–D3 expansion segments of the 28S rRNA gene for nematodes were also assembled with Geneious using published gene fragments of related species [60].
4.3. Phylogenetic Analysis of Bacterial and Nematode Genes
The new Cardinium gene sequences were aligned using ClustalX 1.83 [61] with their corresponding published gene sequences extracted from published genomes of selected Cardinium from GenBank [7,14,15,24,62,63]. Several alignments were generated: (i) 16S rRNA gene; (ii) gyrB gene; (iii) atpD, dnaB, fusA, groEL, gyrB, infB, rspA, and sufB genes; (iv) bioA, bioB, bioD, and bioF genes; (v) bioB gene; (vi) polA gene; (vii) gpsA gene; and (viii) ppdK gene. Outgroup taxa were chosen based on previously published data [7]. New and published D2–D3 expansion segments of 28S rRNA gene sequences of nematodes were also aligned using ClustalX. The best-fit models of DNA and protein evolution for sequence alignments were obtained using jModeltest [64] and ProtTest3.4.2 [65], respectively, with the Akaike Information Criterion. Nucleotide and amino acid sequence alignments were analyzed with Bayesian inference (BI) using MrBayes 3.1.2 and MrBayes 3.2.7, respectively [66]. BI analysis was initiated with a random starting tree and run with four chains for 1.0 × 10^6^ generations for nucleotide sequence alignment and for 1.0 × 10^4^ generation for amino acid sequence alignment. Two runs were performed for each analysis. The Markov chains were sampled at intervals of 100 generations. After discarding 10% and 25% burn-in samples for nucleotide and amino acid alignments, respectively, 50% majority rule consensus trees were generated. Posterior probabilities in percentage are given on the appropriate clades. The phylogenomic tree was inferred by JolyTree 2.1.211019ac [67]. For testing of alternative topologies, ML trees were reconstructed, and we used the Kishino–Hasegawa, Shimodaira–Hasegawa, and Shimodaira Approximately Unbiased tests as implemented in PAUP. 4.0 [68]. Trees were visualized with the TreeView 1.6.6 program [69] and drawn with Adobe Illustrator v.10. Cophylogenetic analysis was conducted to assess the evolutionary associations between host and symbiont lineages. Phylogenetic trees for the hosts and symbionts were constructed and imported in NEXUS format using the ape package in R. An association matrix was generated to indicate the presence or absence of each host–symbiont link. Pairwise phylogenetic distance matrices were calculated for hosts and symbionts using the cophenetic function. The Procrustean Approach to Cophylogeny (PACo) was implemented via the paco package, with a Cailliez correction applied to handle non-Euclidean distances. The significance of the global cophylogenetic signal was assessed using 1000 permutations, and residuals from the Procrustes analysis were intended to identify host–symbiont pairs contributing most to phylogenetic congruence [70,71].
4.4. Genome Annotation and Metabolic Reconstruction
New Cardinium genomes were initially automatically annotated by RAST [72], while the previously assembled genomes were downloaded from the Bacterial and Viral Bioinformatic Resource Center (BV-BRC) database [73], which also uses RAST for genome annotation. For functional gene annotation and reconstruction of metabolic pathways, we used a subsystems-based approach implemented in the SEED platform [74], which combines homology-based methods with three genome context techniques, namely, clustering of genes on the chromosome (operons), co-regulation of genes (regulons), and co-occurrence of genes in genomes. This approach allowed us to capture alternative biochemical routes (e.g., vitamin biosynthesis pathways implemented by different subsets of enzymes), as well as diverse nutrient transporters. For each metabolic pathway, we established genomic signatures represented by a subset of genes that are required for a complete pathway and determined the pathway completeness index by calculating the ratio of genes present in the genomes to the total number of pathway signature genes. The reference collection of bacterial metabolic pathways in the SEED database that was used for functional annotation of target genomes included biosynthetic pathways for all essential vitamins and cofactors, amino acids and nucleotides, and the central carbohydrate metabolism, as well as catabolic utilization pathways for ~100 carbohydrates, amino acids, and other carbon sources, such as lactate and short-chain fatty acids.
5. Conclusions
The newly assembled Cardinium genomes from diverse plant-parasitic nematodes broaden our understanding of the ecological and phylogenetic scope of this endosymbiont lineage. By clarifying the close evolutionary relationship of Cardinium groups B and F, this study supports a single origin for nematode-associated strains and highlights the extensive metabolic streamlining that characterizes these symbionts. The discovery of highly reduced Cardinium genomes in wood-associated Bursaphelenchus species, together with the patchy distribution of biotin biosynthesis and evidence for horizontal acquisition of key metabolic genes, underscores the dynamic evolutionary pressures shaping Cardinium–nematode associations. Future efforts integrating more complete genome assemblies with functional analyses of host–symbiont interactions will be critical for elucidating how metabolic dependency, genome reduction, and horizontal gene transfer jointly drive the evolution and ecological diversification of Cardinium across nematode hosts.
The reference list from the paper itself. Each links out to its DOI / PubMed record.
- 1Zchori-Fein E. Perlman S.J. Distribution of the bacterial symbiont Cardinium in arthropods Mol. Ecol.2004132009201610.1111/j.1365-294X.2004.02203.x 15189221 · doi ↗ · pubmed ↗
- 2Zchori-Fein E. Perlman S.J. Kelly S.E. Katzir N. Hunter M.S. Characterization of a ‘Bacteroidetes’ symbiont in Encarsia wasps (Hymenoptera: Aphelinidae): Proposal of ‘Candidatus Cardinium hertigii’Int. J. Syst. Evol. Microbiol.20045496196810.1099/ijs.0.02957-015143050 · doi ↗ · pubmed ↗
- 3Duron O. Hurst G.D. Hornett E.A. Josling J.A. Engelstädter J. High incidence of the maternally inherited bacterium Cardinium in spiders Mol. Ecol.2008171427143710.1111/j.1365-294X.2008.03689.x 18266629 · doi ↗ · pubmed ↗
- 4Nakamura Y. Kawai S. Yukuhiro F. Ito S. Gotoh T. Kisimoto R. Yanase T. Matsumoto Y. Kageyama D. Noda H. Prevalence of Cardinium bacteria in planthoppers and spider mites and taxonomic revision of “Candidatus Cardinium hertigii” based on detection of a new Cardinium group from biting midges Appl. Environ. Microbiol.2009756757676310.1128/AEM.01583-0919734338 PMC 2772453 · doi ↗ · pubmed ↗
- 5Weinert L.A. Araujo-Jnr E.V. Ahmed M.Z. Welch J.J. The incidence of bacterial endosymbionts in terrestrial arthropods Proc. Biol. Sci.20152822015024910.1098/rspb.2015.024925904667 PMC 4424649 · doi ↗ · pubmed ↗
- 6Schön I. Kamiya T. Van den Berghe T. Van den Broecke L. Martens K. Novel Cardinium strains in non-marine ostracod (Crustacea) hosts from natural populations Mol. Phylogenet. Evol.201913040641510.1016/j.ympev.2018.09.00830244151 · doi ↗ · pubmed ↗
- 7Tarlachkov S.V. Efeykin B.D. Castillo P. Evtushenko L.I. Subbotin S.A. Distribution of bacterial endosymbionts of the Cardinium clade in plant-parasitic nematodes Int. J. Mol. Sci.202324290510.3390/ijms 2403290536769231 PMC 9918034 · doi ↗ · pubmed ↗
- 8Mathieson O.L. Schultz D.L. Hunter M.S. Kleiner M. Schmitz-Esser S. Doremus M.R. The ecology, evolution, and physiology of Cardinium: A widespread heritable endosymbiont of invertebrates FEMS Microbiol. Rev.202549 fuaf 03110.1093/femsre/fuaf 03140705355 PMC 12342168 · doi ↗ · pubmed ↗
