Evidence for Paternal Mitochondrial DNA Leakage in Diploid Hybrid Fish Lineages
Yalan Zhang, Qinglin Xu, Wei Chen, Sijin Fan, Yu Hu, Xinyue Deng, Gaode Zhong, Kaikun Luo, Mingli Chai, Huan Zhong, Wuhui Li, Fangzhou Hu, Shi Wang, Shaojun Liu

TL;DR
This study shows that distant hybridization in fish leads to unstable mitochondrial DNA, with random paternal leakage followed by selective elimination to stabilize hybrid offspring.
Contribution
The study reveals paternal mitochondrial DNA leakage and selective elimination in hybrid fish lineages, offering new insights into hybrid adaptability and mitochondrial evolution.
Findings
Paternal mtDNA leakage occurs randomly in first-generation hybrid fish.
Selective pressure eliminates incompatible mtDNA variations in hybrid lineages.
Subsequent self-crossing reduces structural variation in mitochondrial genomes.
Abstract
Distant hybridization can induce rapid changes in the genotype and phenotype of offspring. Compared to the nuclear genome, the mitochondrial genome often possesses greater potential for cross-species introgression. This distinction facilitates the more stable transmission of mitochondrial DNA (mtDNA) in hybrid offspring, directly regulates their adaptability, and provides a key driving force for species evolution. Here, through an analysis of complete mitochondrial genome genetic variation in offspring from distant hybridization between female common carp and male blunt snout bream, we reveal that the mitochondrial genomes in first-generation hybrid species are unstable under the impact of distant hybridization shock, exhibiting paternal mtDNA leakage with randomness. However, to establish viable offspring under the impact of distant hybridization shock, strong selective pressure…
- Yuelushan Laboratory Breeding Program
- National Key R&D Program of China
- National Natural Science Foundation of China
- Youth Science and Technology Talents Lifting Project of Hunan Province
- Earmarked Fund for the China Agriculture Research System
- 111 Project
Taxonomy
Topics: Genetic diversity and population structure · Evolution and Genetic Dynamics · Mitochondrial Function and Pathology
1. Introduction
Hybridization is a prevalent genetic mechanism in nature, having occurred in at least 25% of plants and 10% of animals, especially among evolutionarily recent lineages [1]. This process serves as a significant source of genetic variation and can profoundly influence evolutionary trajectories through mechanisms such as gene recombination, gene introgression, mutation, and reproductive isolation [2]. Hybridization can also contribute to adaptive differentiation and the emergence of new lineages that incorporate ancestral genetic material [3]. Therefore, research into the mechanisms of hybridization contributes to deepening our understanding of species origins and accelerating the unveiling of the mysteries surrounding the underlying mechanisms of speciation. Furthermore, hybridization techniques have been widely applied in genetic improvement and aquaculture programs, aiming to cultivate a series of superior varieties characterized by fast growth, strong stress tolerance, high disease resistance, and good meat quality [4,5,6,7]. Initially, research on hybridization was focused primarily on crosses between phylogenetically close species, whose hybrids tend to exhibit greater viability and fertility. However, with a deeper understanding of hybridization techniques, an increasing number of studies have demonstrated that distant hybridization can overcome reproductive barriers, facilitating gene transfer between species and promoting rapid changes in the genotype and phenotype of offspring. Consequently, it has now gained significant favor among scholars in the fields of genetic breeding and evolution. At the genomic level, this process can lead to alterations in DNA ploidy and structural variation, thereby expanding the adaptive and evolutionary potential of hybrid offspring [8,9,10].
Mitochondria are semi-autonomous organelles that possess their own genetic system. This independence enables them to carry out essential functions such as genome replication, transcription, and protein synthesis. Mitochondrial DNA (mtDNA) has been widely used as a marker in phylogenetic and population genetics studies due to its predominantly maternal inheritance, compact structure, and relatively high evolutionary rate. It serves to elucidate the evolutionary origins and relationships among species [11]. Furthermore, as a complete genome, mtDNA facilitates the elucidation of genome evolutionary mechanisms within a phylogenetic framework. Differences in its structural features, such as genome size, gene content, gene order, and non-coding sequences, reflect variations in functional and evolutionary constraints. These features can both trace the evolutionary trends of ancient species and be used to identify functional limitations that lead to structural variation [12]. The strict maternal inheritance of mitochondria observed in most species has been associated with reducing genetic variation among intracellular organelles, suppressing the transmission of deleterious mutations, and enhancing the efficiency of purifying selection [13,14,15,16,17,18]. Although paternal mitochondrial leakage is rare, its occurrence, even at low frequencies, can affect estimates of genetic diversity, gene flow, and phylogenetic reconstruction [19,20,21,22,23]. Furthermore, evidence indicates that mitochondrial recombination can occur following paternal leakage, although the underlying mechanisms remain unclear [16]. Documented cases include triploid crucian carp [24] and offspring resulting from hybridization between black spruce (Picea mariana) and red spruce (Picea rubens Sarg) [25]. These findings suggest that mtDNA dynamics in hybrids may be more complex than previously assumed. 
Distant hybridization can facilitate mitochondrial introgression between species, alter nuclear–mitochondrial interactions, and may consequently influence the adaptability and evolutionary trajectory of hybrid lineages [26,27,28]. Recent studies have demonstrated paternal mitochondrial leakage and recombination in fish hybrids, further underscoring the necessity of investigating mtDNA structural variation in the context of hybridization [24,29].
As the experimental subjects of this study, the improved diploid carp (IDC) and improved diploid scattered mirror carp (IDMC) lineages were derived from distant hybridization between female common carp and male blunt snout bream. These two lineages exhibit distinguishable phenotypic differences. Notably, within the self-cross offspring of IDC-F_1_, the population differentiated into two distinct phenotypic types: one resembling the common carp phenotype, and the other more similar to the improved diploid scattered mirror carp phenotype. This phenomenon of highly similar genotypes but differentiated or even polymorphic phenotypes has sparked our great interest. Previous evidence has indicated structural differences in nuclear genes (Hox gene family), suggesting that mechanisms such as genomic recombination, duplications/deletions, alterations in gene expression, transposon activation, and epigenetic effects collectively contributed to the observed phenotypic differentiation in IDC-F_2_ [30]. However, it remains unclear whether distant hybridization also induces structural variation in the mitochondrial genome and whether such alterations affect the synergistic interaction between the nuclear and mitochondrial genomes. This study aims to investigate the impact of distant hybridization on mtDNA structural variation in the IDC lineage (F_1_, F_2_) and the IDMC lineage (F_1_, F_2_) through analysis of the complete mitochondrial genome. We seek to characterize patterns of genetic variation, identify evidence of paternal mitochondrial leakage, and evaluate the evolutionary dynamics of mtDNA across hybrid generations. These findings provide novel insights into mitochondrial inheritance in vertebrates and contribute to the understanding of the molecular mechanisms that regulate nucleus-mitochondria interaction in hybrid populations.
2. Materials and Methods
2.1. Experimental Fish
Experimental fish included common carp (abbreviated as COC, 2n = 100), blunt snout bream (abbreviated as BSB, 2n = 48), red crucian carp (abbreviated as RCC, 2n = 100), the improved diploid carp (abbreviated as IDC, 2n = 100), and the improved diploid scattered mirror carp (abbreviated as IDMC, 2n = 100). All specimens were sourced from the Engineering Research Center of Polyploid Fish Reproduction and Breeding of the State Education Ministry, Hunan Normal University, Changsha, Hunan Province, China.
In previous research, our laboratory conducted distant hybridization between common carp (Cyprinus carpio, 2n = 100) (♀) and blunt snout bream (Megalobrama amblycephala, 2n = 48) (♂), yielding fertile improved diploid carp (IDC-F_1_) and improved diploid scattered mirror carp (IDMC-F_1_) in the F_1_ generation. These two improved types exhibited a degree of phenotypic difference but possessed relatively similar genotypes. The two improved types were self-crossed separately, resulting in two sub-generation lineages: IDC-F_2_ and IDMC-F_2_. Observation of the morphological characteristics of IDC-F_2_ revealed that this population differentiated into two distinct phenotypic types. One type of offspring exhibited a phenotype similar to common carp (abbreviated as IDC-F_2_-C), while the other showed a phenotype more akin to the improved diploid scattered mirror carp (abbreviated as IDC-F_2_-M). In contrast, the phenotypic characteristics of IDMC-F_2_ showed no difference from those of its parental IDMC-F_1_ [30].
2.2. DNA Extraction, PCR Amplification, Cloning, and Sequencing
In this experiment, blood was collected from BSB, COC, IDC-F_1_, IDMC-F_1_, IDC-F_2_-C, IDC-F_2_-M, IDMC-F_2_, and RCC using standard methods. The blood was diluted with pre-cooled PBS and centrifuged at 600× g, 4 °C. The supernatant was removed, and the pellet was resuspended in 1 mL of pre-cooled mitochondrial isolation buffer (250 mM sucrose, 10 mM Tris-HCl, pH 7.4, 1 mM EDTA). The sample was homogenized using a TGrinder H24R tissue homogenizer at low temperature. The homogenate was centrifuged at 1000× g, 4 °C for 10 min. The supernatant was transferred to a new tube and centrifuged at 12,000× g, 4 °C for 15 min. The pellet was resuspended in 500 μL of DNase I reaction buffer (10 U/mL DNase I, 50 mM Tris-HCl, pH 7.6, 10 mM MgCl_2_) and incubated at 37 °C for 10 min. It was then centrifuged at 12,000× g, 4 °C for 10 min. The pellet was washed twice with pre-cooled PBS to remove residual DNase I. Genomic DNA was then extracted from this mitochondria-enriched, DNase I-treated pellet using the TaKaRa MiniBEST Universal Genomic DNA Extraction Kit Ver.5.0 (9765), produced by Takara Biomedical Co., Ltd. (TaKaRa, Beijing, China). The extracted DNA templates were subsequently assessed using gel electrophoresis, and their OD values were measured with a microplate reader for use in subsequent experiments.
The extracted DNA was subjected to PCR amplification, followed by product detection and purification. To obtain the complete mitochondrial sequence, up to 22 contiguous and overlapping fragments were amplified in the experimental fish using highly conserved PCR primers [29]. The PCR primer details are provided in Supplementary Table S1. The total volume for the PCR reaction was 25 μL, containing 1 μL DNA (approximately 10 ng), 0.8 μL each of forward and reverse primers (10 μM), 12.5 μL LA mix, and 9.9 μL sterile water. The reaction protocol was as follows: initial denaturation at 95 °C for 5 min; followed by 32 cycles of denaturation at 95 °C for 30 s, annealing at 52–57 °C for 30 s, and extension at 72 °C for 1–3 min; with a final extension at 72 °C for 8 min. The PCR products were separated by electrophoresis on a 1.2% agarose gel containing ethidium bromide (EB), using a voltage of 10 V/cm for 15–20 min. The product size and purity were determined using a gel imaging system. If the imaging system showed a single band per replicate lane and the fragment size was as expected, the PCR products were directly sent for sequencing. If non-specific bands were present, the amplification temperature was re-optimized. After further verification, the target band was excised from the agarose gel, purified using a Shanghai Sangon gel extraction kit, and then submitted for sequencing. For some complex PCR products, the fragments were cloned into the pMD18-T vector (TaKaRa, Dalian, China), and the plasmids were transformed into E. coli DH5α cells and purified. For each PCR product, at least three clones were sequenced on an ABI 3730XL automated sequencer (ABI PRISM 3730, Applied Biosystems, Foster City, CA, USA) using the primer walking method with vector-specific primers. For directly sequenced fragments, forward and reverse sequences were required to be consistent. 
For complex regions, confirmation was obtained through sequencing multiple clones (≥3 clones) combined with bidirectional sequencing to rule out PCR or sequencing errors.
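As a quick consistency check on the reaction mix described above, the component volumes can be summed and the final primer concentration derived. This is a minimal sketch only: the dictionary keys and function names are illustrative and do not come from the study, and the volumes and the 10 μM primer stock are taken directly from the text.

```python
# Reaction components in microlitres, as listed in the Methods text.
REACTION_UL = {
    "template_dna": 1.0,
    "forward_primer": 0.8,
    "reverse_primer": 0.8,
    "la_mix": 12.5,
    "sterile_water": 9.9,
}


def total_volume_ul(components):
    """Sum the component volumes, rounded to avoid floating-point noise."""
    return round(sum(components.values()), 2)


def final_concentration_um(stock_um, added_ul, total_ul):
    """Dilution formula C1 * V1 = C2 * V2, solved for the final concentration C2."""
    return stock_um * added_ul / total_ul


total = total_volume_ul(REACTION_UL)
primer_um = final_concentration_um(10.0, REACTION_UL["forward_primer"], total)
print(f"total volume: {total} uL, final primer concentration: {primer_um:.2f} uM")
```

The components add up to the stated 25 μL, and each primer ends up at roughly 0.32 μM in the final reaction.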
2.3. Analysis of the Structure and Composition of the Complete Mitochondrial Genome Sequence
The verified fragment sequences mentioned above were used for the subsequent assembly and variation analysis of the whole mitochondrial genome. Using the Clustal W sequence alignment mode in BioEdit [31], each gene fragment was aligned against the existing common carp mitochondrial genome sequence (KF856965.1) from the GenBank database at NCBI (https://www.ncbi.nlm.nih.gov/, accessed on 6 August 2025) to verify that it was the target fragment and to preliminarily determine the start and stop positions of each gene within the full sequence. Based on the gene positions and after manual verification, the sequences were assembled using DNAMAN [32]. The output result was the fully assembled mitochondrial genome sequence. The sequence was then aligned against the mitochondrial genome sequence of the original maternal parent common carp (KF856965.1) using BioEdit. Redundant bridging sequences, introduced during assembly due to insufficient overlap, were removed to obtain the complete mitochondrial genome sequence for the respective fish. The complete mitochondrial genome sequences of common carp (KF856965.1), blunt snout bream (NC_010341.1), and crucian carp (GU086395.1) were obtained from the GenBank database to serve as reference controls for this study. Preliminary annotation was performed using the MITOS Webserver (http://mitos.bioinf.uni-leipzig.de/index.py, accessed on 19 August 2025) to confirm gene positions and classify genes into heavy and light strands. BLAST (http://www.ncbi.nlm.nih.gov, accessed on 6 August 2025) was utilized to align the sequences against other published mitochondrial genomes of cyprinid fish, facilitating the identification of protein-coding genes, the non-coding control region, ribosomal RNA genes, and transfer RNA genes.
2.4. Analysis of Protein-Coding Gene (PCG) Sequence Structure
MEGA 11 [33] was used to analyze the nucleotide composition (percentages of A, T, C, G, A + T, and G + C) and amino acid usage frequencies of the 13 protein-coding genes (PCGs) in the mitochondrial genome. To identify mutation sites, the mitochondrial whole-genome sequences of each hybrid individual were aligned base-by-base with their respective maternal parent to identify all single nucleotide variant (SNV) sites. These sites were then categorized into the first, second, or third codon positions based on annotation information. The analysis focused on the base substitution patterns at the third codon position, and the mutation rate (number of mutations/total aligned bases, expressed as a percentage) was calculated.
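The SNV bookkeeping described above can be sketched in a few lines of Python. This is an illustrative sketch, not the MEGA 11 workflow used in the study. It assumes two already-aligned sequences of equal length and a single coding region whose start and end are given as 0-based coordinates, and it uses the reference length as the "total aligned bases" denominator. All function names and the toy sequences are hypothetical.

```python
def snv_sites(reference, query):
    """0-based positions where two aligned sequences carry different bases.

    Columns containing gaps or ambiguous bases are skipped.
    """
    if len(reference) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    return [
        i
        for i, (r, q) in enumerate(zip(reference.upper(), query.upper()))
        if r != q and r in "ACGT" and q in "ACGT"
    ]


def codon_position_counts(reference, query, cds_start, cds_end):
    """Count SNVs at codon positions 1, 2 and 3 inside [cds_start, cds_end)."""
    counts = {1: 0, 2: 0, 3: 0}
    for site in snv_sites(reference, query):
        if cds_start <= site < cds_end:
            counts[(site - cds_start) % 3 + 1] += 1
    return counts


def mutation_rate_percent(reference, query):
    """Number of mutations / total aligned bases, expressed as a percentage."""
    return 100.0 * len(snv_sites(reference, query)) / len(reference)


# Toy example: a maternal reference and a hybrid sequence, three codons long.
maternal = "ATGGCCTTA"
hybrid = "ATGGCTTTG"
print(codon_position_counts(maternal, hybrid, 0, 9))
print(round(mutation_rate_percent(maternal, hybrid), 2))
```

In the toy example both differences fall on third codon positions, which mirrors the pattern the analysis focuses on.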
2.5. Analysis of Mitochondrial Genome Genetic Variation
Comparative analysis of the complete mitochondrial genome sequences across different generations of the IDC and IDMC lineages was conducted. Sequencing was focused on ten mitochondrial genes (structural regions), including the control region (CR), which is the most variable region of the mitochondrial sequence [34]; the 12S rRNA and 16S rRNA genes, which are the slowest-evolving genes in the mitochondrial sequence; and the COI, COII, COIII, NADH2 (ND2), NADH3 (ND3), NADH4 (ND4), and NADH5 (ND5) genes, which are among the fastest-evolving genes in the mitochondrial sequence. Using BioEdit and MEGA 11, polymorphism analysis was performed on the sequences of the aforementioned ten mitochondrial genes (structural regions) across the studied species to investigate the genetic variation between different generations within the IDC and IDMC lineages, as well as their variations compared to the parental sequence structures.
3. Results
3.1. Basic Characteristics of Mitochondrial Genome Sequence Structure and Composition
As shown in Table S2, among the mitochondrial genome sequences of various species, the size of the mitochondrial genes (structural regions) showed little difference among species, except for BSB. IDC-F_1_ and IDMC-F_1_ were completely identical in the size of all genes (structural regions). They differed from the maternal parent COC by only 1–2 bp in two genes (structural regions): COII and tRNA-Cys. Their differences from IDC-F_2_ (including IDC-F_2_-C and IDC-F_2_-M, which were nearly identical to each other in the size of all mitochondrial genes/structural regions) were limited to 1–2 bp in the 12S rRNA, tRNA-Cys, and COI genes, while the differences from RCC and BSB were somewhat larger. IDMC-F_1_ differed from the maternal parent COC by only 1–2 bp in tRNA-Cys and COII, and from IDMC-F_2_ by 1–2 bp in 12S rRNA, tRNA-Cys, and COI. Its difference from RCC was somewhat larger, and the greatest difference was observed with BSB. Differences in size between IDC-F_2_ (including IDC-F_2_-C and IDC-F_2_-M) and IDMC-F_2_ were found only in the 12S rRNA and 16S rRNA genes. Specifically, IDC-F_2_-C and IDC-F_2_-M differed by only 1 bp in the 12S rRNA gene, and IDC-F_2_-C differed from IDMC-F_2_ by only 1 bp in the 16S rRNA gene. Regarding the sequence intervals of mitochondrial genes (structural regions), the IDC and IDMC lineages were completely consistent. They differed from COC only in the interval numbers of the tRNA-Asp and COI genes, while showing somewhat larger differences in mitochondrial genes (structural regions) sequence intervals compared to the paternal parent BSB and RCC.
3.2. Analysis of Nucleotide Sequences and Codon Composition of Protein-Coding Genes
Regarding codon usage in the 13 protein-coding gene sequences, the start codons for all protein-coding genes were consistent across all species. In terms of stop codons, IDC-F_1_ differed from the maternal parent COC and the offspring IDC-F_2_ (including IDC-F_2_-C and IDC-F_2_-M) only in the ND6 gene. It differed from the paternal parent BSB in three genes: ATPase8, ND4, and ND6, but was consistent with RCC in the stop codons of all genes. The stop codons for all genes were consistent between IDC-F_2_-C and IDC-F_2_-M, and were also consistent with those of the original maternal parent COC, but differed from those of BSB and RCC. IDMC-F_1_ shared consistent stop codons with the maternal parent COC and IDMC-F_2_, differed from the paternal parent BSB in the ATPase8 and ND4 genes, and differed from RCC only in the ND6 gene. The stop codons for all genes were consistent between IDMC-F_2_ and IDC-F_2_ (including IDC-F_2_-C and IDC-F_2_-M). In the nucleotide composition at each codon position of the 13 protein-coding genes (Table S3), compared to the parental COC and BSB, the IDC and IDMC lineages exhibited a relatively larger fluctuation range (0.05–3.15%) for the corresponding nucleotides at the third codon position, relative to those at the first and second positions. This section of the study conducted a focused analysis on the third codon position in the IDC and IDMC lineages. These analytical results showed that in the mitochondrial protein-coding gene sequences of IDC-F_1_, 761 base sites had mutated relative to the maternal parent COC, with the majority of these mutations (83.00%) occurring at the third codon position. The mitochondrial protein-coding gene sequences of IDC-F_2_-C1 had 581 base sites mutated relative to the maternal IDC-F_1_, with the majority of these mutations (79.70%) occurring at the third codon position. 
The mitochondrial protein-coding gene sequences of IDC-F_2_-M1 had 574 base sites mutated relative to the maternal IDC-F_1_, with the majority of these mutations (80.31%) occurring at the third codon position. The mitochondrial protein-coding gene sequences of IDMC-F_1_ had 581 base sites mutated relative to the maternal parent COC, with the majority of these mutations (79.70%) occurring at the third codon position. The mitochondrial protein-coding gene sequences of IDMC-F_2_ had 103 base sites mutated relative to the maternal IDMC-F_1_, with the majority of these mutations (71.84%) occurring at the third codon position (Table 1). Table 1 also shows the types and proportions of mutations at the third codon position in the mitochondrial protein-coding gene sequences for several fish species, using their respective maternal parents as the reference sequences. In IDC-F_1_, synonymous mutations at the third codon position accounted for 95.25%, while non-synonymous mutations constituted only 4.73%. In IDC-F_2_-C1, synonymous mutations at the third codon position accounted for 95.03%, with non-synonymous mutations making up merely 4.97%. In IDC-F_2_-M1, synonymous mutations at the third codon position accounted for 94.79%, while non-synonymous mutations constituted only 5.21%. In IDMC-F_1_, synonymous mutations at the third codon position accounted for 93.70%, with non-synonymous mutations making up 6.30%. In IDMC-F_2_, synonymous mutations at the third codon position accounted for 90.54%, and non-synonymous mutations accounted for 9.46%. In both the F_1_ and F_2_ generations, IDMC had fewer mutated bases at the third codon position than IDC. Although the number of non-synonymous mutations did not differ significantly between IDC and IDMC at the F_1_ stage, the proportion of non-synonymous mutations was larger in IDMC than in IDC. 
Although IDMC-F_2_ had the lowest total number of mutations, it displayed the highest rate of non-synonymous mutations. Furthermore, among the mutation types at the third codon position, the transitions from base C to T or from T to C were the most frequent (18.57–33.26%) in both the IDC and IDMC lineages.
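The split into synonymous and non-synonymous third-position changes can be illustrated with a short sketch. It assumes the vertebrate mitochondrial genetic code (NCBI translation table 2), in-frame coding sequences of equal length, and it isolates the effect of the third base by keeping the first two reference bases fixed. The function name and the toy sequences are hypothetical, and this is not the procedure the authors ran in MEGA 11.

```python
BASES = "TCAG"
# NCBI translation table 2 (vertebrate mitochondrial), codons ordered
# TTT, TTC, TTA, TTG, TCT, ... Unlike the standard code, ATA codes for Met,
# TGA codes for Trp, and AGA/AGG are stop codons.
AMINO_ACIDS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def third_position_changes(ref_cds, qry_cds):
    """Return (synonymous, non_synonymous) counts of third-position substitutions.

    Each changed third base is evaluated against the reference codon, so a
    change elsewhere in the same codon does not influence the classification.
    """
    if len(ref_cds) != len(qry_cds):
        raise ValueError("coding sequences must have the same length")
    ref_cds, qry_cds = ref_cds.upper(), qry_cds.upper()
    synonymous = non_synonymous = 0
    for start in range(0, len(ref_cds) - len(ref_cds) % 3, 3):
        ref_codon = ref_cds[start:start + 3]
        new_base = qry_cds[start + 2]
        if ref_codon not in CODON_TABLE or new_base not in BASES:
            continue  # skip gaps and ambiguous bases
        if new_base == ref_codon[2]:
            continue  # third position unchanged
        if CODON_TABLE[ref_codon[:2] + new_base] == CODON_TABLE[ref_codon]:
            synonymous += 1
        else:
            non_synonymous += 1
    return synonymous, non_synonymous


# Toy example: ATT->ATA (Ile->Met), CTT->CTC (Leu->Leu), TGA->TGG (Trp->Trp).
print(third_position_changes("ATTCTTTGA", "ATACTCTGG"))
```

Using the mitochondrial table rather than the standard code matters here, because the two disagree on exactly the ATA, TGA, AGA, and AGG codons.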
We conducted statistics on the amino acid usage frequency in the mitochondrial protein-coding gene sequences across the studied species (Table 2). It was found that the amino acid usage patterns were similar between the IDC and IDMC lineages, with minimal differences. The usage frequencies were most similar between IDC-F_2_ (including IDC-F_2_-C and IDC-F_2_-M) and IDMC-F_2_. Not only did the usage frequency for each amino acid differ by no more than 0.06%, but the frequencies for multiple amino acids, such as Ala, Glu, and Phe, were completely identical. A similar phenomenon was also observed between IDC-F_1_ and IDMC-F_1_. Comparing the IDC and IDMC lineages with COC, BSB, and RCC revealed that the usage frequencies of most amino acids did not differ greatly. However, significant differences were found in the usage frequencies of the three amino acids Ile, Met, and Trp between the IDC and IDMC lineages compared to COC, BSB, and RCC. As clearly shown in Table 2, the usage frequency of Ile sharply decreased in both the IDC and IDMC lineages, while the usage frequencies of Met and Trp sharply increased in these lineages. Furthermore, these trends were largely consistent across both the F_1_ and F_2_ generations. Combined with the data from Table 1, among the non-synonymous base substitutions in the mitochondrial protein-coding gene sequences of the IDC and IDMC lineages, the A-to-G substitution had the highest proportion (13.89–25.00%).
3.3. Analysis of Genetic Variation in Mitochondrial Genome Sequences
As indicated by prior analyses, polymorphic individuals were identified across multiple generations of the IDC and IDMC lineages. We analyzed the genetic variation in mitochondrial genome structure across these generations, focusing on ten specific regions: the CR (control region), 12S rRNA, 16S rRNA, COI, COII, COIII, ND2, ND3, ND4, and ND5. As shown in Figure 1 and Table 3, varying degrees of genetic variation were present in the mitochondrial genome structures across different generations of the IDC and IDMC lineages. In the first generation (F_1_) of both lineages, the majority of base sites were conserved (75.89–93.08%), while a small portion was identical to those of the maternal parent COC (0.00–21.13%). Interestingly, we discovered paternal leakage in the mitochondrial genomes of both IDC-F_1_ and IDMC-F_1_. In IDC-F_1_, 0.43% of the base sites in the CR, 0.10–0.12% of the base sites in the two rRNAs, and 0.00–7.80% of the base sites in the seven protein-coding genes matched the paternal parent BSB. In IDMC-F_1_, 0.43% of the base sites in the CR, 0.10–0.12% of the base sites in the two rRNAs, and 0.00–13.39% of the base sites in the seven protein-coding genes matched the paternal parent BSB. Overall, IDC-F_1_ mitochondrial genomes had an average of 190.25 paternal base insertions (the average number of paternal embedded bases among different polymorphic individuals), slightly higher than the average of 152.25 in IDMC-F_1_. Furthermore, in both IDC-F_1_ and IDMC-F_1_, the number and distribution of paternal leakage within the same mitochondrial gene (structural region) varied significantly across different polymorphic individuals. For the COI gene in IDC-F_1_, the proportion of sites with paternal base insertions was 7.80% in IDC-F_1_-3, but only 0.06% in IDC-F_1_-4. For the ND4 gene, the proportion was 7.31% in IDC-F_1_-1, compared to only 0.22% in IDC-F_1_-4. 
For the ND3 gene in IDMC-F_1_, the proportion of sites with paternal base insertions was 13.39% in IDMC-F_1_-3, but 0.00% in IDMC-F_1_-4; for the COI gene, the proportion was 4.64% in IDMC-F_1_-1, compared to 0.06% in IDMC-F_1_-4. Furthermore, we found that a portion of the paternal base insertions present in IDC-F_1_ and IDMC-F_1_ were stably transmitted into the IDC-F_2_ (including IDC-F_2_-C and IDC-F_2_-M) and IDMC-F_2_. In IDC-F_1_, all paternally inserted base sites in the CR, 12S rRNA, and 16S rRNA genes, along with 0.00% to 100.00% of such sites in the seven protein-coding genes, were stably transmitted to IDC-F_2_. The paternally inserted base sites stably transmitted to IDC-F_2_-C and IDC-F_2_-M were largely identical. In IDMC-F_1_, all paternally inserted base sites in the CR, 12S rRNA, and 16S rRNA genes, and 59.72% to 100.00% of those in the seven protein-coding genes, were stably transmitted to IDMC-F_2_ (Table 4). In the ND4 and ND5 genes of IDC-F_2_ (including IDC-F_2_-C and IDC-F_2_-M) and IDMC-F_2_, although the proportions of stably inherited paternal base insertions varied significantly, the actual numbers of these insertions were largely consistent (Table 4).
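The per-site categories reported above (conserved, maternal-matching, paternal-matching, mutated) can be expressed as a simple classification over three aligned sequences. The sketch below is one plausible reading of those categories and is an assumption rather than the authors' stated procedure: a site is "conserved" when both parents and the hybrid agree, "maternal" or "paternal" when the hybrid matches only that parent, and "mutation" when it matches neither. Function names and toy sequences are hypothetical.

```python
from collections import Counter

LABELS = ("conserved", "maternal", "paternal", "mutation")


def classify_site(maternal, paternal, hybrid):
    """Assign one aligned site in the hybrid to its apparent origin."""
    if hybrid == maternal == paternal:
        return "conserved"
    if hybrid == maternal:
        return "maternal"
    if hybrid == paternal:
        return "paternal"
    return "mutation"


def site_origin_percentages(maternal_seq, paternal_seq, hybrid_seq):
    """Percentage of sites in each category for three equal-length aligned sequences."""
    if not (len(maternal_seq) == len(paternal_seq) == len(hybrid_seq)):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter(
        classify_site(m, p, h)
        for m, p, h in zip(maternal_seq.upper(), paternal_seq.upper(), hybrid_seq.upper())
    )
    total = len(hybrid_seq)
    return {label: 100.0 * counts[label] / total for label in LABELS}


# Toy example: one paternal-matching site, one maternal-matching site, one new mutation.
coc = "ACGTACGTAC"   # maternal parent
bsb = "ACGTTCGAAC"   # paternal parent
f1 = "ACGTTCGTAG"    # hybrid individual
print(site_origin_percentages(coc, bsb, f1))
```

Running the same function on each polymorphic individual separately would expose the individual-to-individual differences in paternal-matching sites that the text describes.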
In both the IDC and IDMC lineages, mutations occurred at some base sites due to the insertion of paternal bases into the mitochondrial genome. In IDC-F_1_, 0.43% to 0.54% of the base sites in the CR mutated, the mutation rates in the two rRNAs ranged from 0.00% to 0.24%, and the mutation rates in the seven protein-coding genes ranged from 0.00% to 7.68%. In IDMC-F_1_, 0.43% to 0.65% of the base sites in the CR mutated, the mutation rates in the two rRNAs ranged from 0.00% to 0.30%, and the mutation rates in the seven protein-coding genes ranged from 0.00% to 7.22%. Interestingly, the majority of these mutation sites were consistent with those in RCC. Overall, the mitochondrial genome of IDC-F_1_ had an average of 221 mutation base sites (the average number of mutation base sites among different polymorphic individuals), which was slightly higher than the 155 in IDMC-F_1_. Similar to the paternal leakage observed in mitochondrial genomes, the extent of base mutations within the same mitochondrial gene (structural region) varied significantly among different polymorphic individuals in both IDC-F_1_ and IDMC-F_1_. In IDC-F_1_, for the COI, the proportion of base mutations was 7.29% in IDC-F_1_-1, but only 0.26% in IDC-F_1_-4. For the ND4, the proportion of base mutations was 7.68% in IDC-F_1_-1, compared to only 0.36% in IDC-F_1_-4. In IDMC-F_1_, for the COI, the proportion of base mutations was 7.22% in IDMC-F_1_-1 and 0.26% in IDMC-F_1_-4. For the ND4, the proportion of base mutations was 5.00% in IDMC-F_1_-2 and 0.29% in IDMC-F_1_-1. Furthermore, we found that a portion of the mutation base sites present in IDC-F_1_ and IDMC-F_1_ were stably transmitted into the IDC-F_2_ (including IDC-F_2_-C and IDC-F_2_-M) and IDMC-F_2_. 
In IDC-F_1_, all mutation base sites in the CR and 16S rRNA, and 1.89% to 100.00% of the mutation base sites in the seven protein-coding genes, were stably transmitted to IDC-F_2_, with those inherited into IDC-F_2_-C being consistent with the paternal embedded base sites in IDC-F_2_-M. In IDMC-F_1_, all mutation base sites in the CR, 66.67% of those in the 16S rRNA, and 53.57% to 100.00% of the mutation base sites in the seven protein-coding genes were stably transmitted to IDMC-F_2_. In the 16S rRNA, ND4, and ND5 genes of IDC-F_2_ (including IDC-F_2_-C and IDC-F_2_-M) and IDMC-F_2_, although the proportions of stably inherited mutation base sites differed significantly, the number of these stably inherited mutation base sites was largely consistent.
As shown in Figure 2 and Table 5, the sequences of the ten mitochondrial genes (structural regions) in the IDC and IDMC lineages, derived from the distant hybridization between COC (♀) and BSB (♂), exhibited polymorphism, with different individuals within the same population displaying multiple genetic variation patterns. At the same base sites, some were inherited from the maternal parent COC, while others were inherited from the paternal parent BSB or had undergone mutation. In IDC-F_1_, the polymorphic sites in the CR accounted for 0.22%, in the two rRNAs for 0.18–0.21%, and in the seven protein-coding genes for 1.27–15.34%. In IDC-F_2_-C, the two rRNAs no longer exhibited polymorphic sites, while the CR had 0.11% polymorphic sites, and the seven protein-coding genes had 0.09–0.64% polymorphic sites. In IDC-F_2_-M, the two rRNAs no longer exhibited polymorphic sites, while the CR had 0.11% polymorphic sites, and the seven protein-coding genes had 0.13–0.72% polymorphic sites. In IDMC-F_1_, the polymorphic sites in the CR accounted for 0.22%, in the two rRNAs for 0.10–0.42%, and in the seven protein-coding genes for 0.10–14.25%. In IDMC-F_2_, the two rRNAs no longer exhibited polymorphic sites, while the CR had 0.22% polymorphic sites, and the seven protein-coding genes had 0.28–0.52% polymorphic sites. In the F_2_ generations of both improved carp lineages, the number of polymorphic sites in IDC-F_2_ (including IDC-F_2_-C and IDC-F_2_-M) and IDMC-F_2_ was drastically reduced compared to their respective parental lines. The genetic variation patterns exhibited by different individuals within IDC-F_2_ (including IDC-F_2_-C and IDC-F_2_-M) and IDMC-F_2_ were also diminished. 
As shown in Figure 3, IDC-F_1_ and IDMC-F_1_ both carried a relatively high number of polymorphic sites within the mitochondrial genes (structural regions), but the distribution patterns of these sites across their mitochondrial genomes were not consistent and differed notably. Furthermore, within their respective self-bred generations, the distribution of polymorphic base sites also varied between IDC-F_2_ (including IDC-F_2_-C and IDC-F_2_-M) and IDMC-F_2_.
4. Discussion
Hybridization is a major source of evolutionary innovation [35], promoting speciation through the generation of genetic and phenotypic variation, which may precede adaptive radiation [36,37]. Hybrids are generally more stable across varying environments than inbred lines [38] and can exhibit heterosis, producing phenotypes superior to those of the parental lines [39]. To investigate the impact of distant hybridization on mitochondrial DNA (mtDNA) structural variation, this study analyzed genetic variation in the mitochondrial genomes of the improved diploid carp (IDC) and improved diploid scattered mirror carp (IDMC) lineages derived from distant hybridization between female common carp and male blunt snout bream. Analysis of the complete mitochondrial genome sequence structure and composition of these two improved carp lineages and their parents revealed differences in the basic mitochondrial genome structure across generations within both the IDC and IDMC lineages. Interestingly, despite distinct phenotypic differences, IDC-F_2_-C and IDMC-F_2_ exhibited relatively small differences in their mitochondrial genome structures. Furthermore, both the IDC and IDMC lineages were most similar to the original maternal parent COC, followed by RCC, and differed most from the original paternal parent BSB.
Subsequent analysis of the protein-coding gene sequences revealed that, compared to the parental lines, the bases at the third codon position of the mitochondrial protein-coding genes exhibited a relatively large fluctuation range. The number of mutations at this position was significantly higher in the IDC lineage than in the IDMC lineage. Compared to the F_1_ generation, both lineages exhibited a sharp decrease in mutation number in the F_2_ generation. Within IDC-F_2_, IDC-F_2_-C1 and IDC-F_2_-M1 exhibited a similar number of mutations at the third codon position. Further analysis of the base composition at the third codon position revealed that the majority of the mutations were synonymous, with C-to-T or T-to-C transitions being the most frequent. This base substitution pattern is mechanistically similar to the processes of cytosine methylation and demethylation. Hybridization is known to induce changes in DNA methylation, which plays a crucial role in genome regulation and gene expression, thereby influencing growth, development, and phenotype [40,41]. In fact, epigenetic variation in DNA methylation may generate new allelic states that alter transcription, thereby providing a mechanism for phenotypic diversity in the absence of genetic mutations [42]. Epigenetic markers have been detected in multiple mitochondrial protein genes, and their methylation status is associated with tissue metabolism [43].
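The link between third-position C/T transitions and synonymous change can be checked directly against a codon table. The sketch below enumerates all sixteen first and second base combinations and counts how many amino acids change when the third base switches between C and T. It assumes the vertebrate mitochondrial code (NCBI translation table 2); this section does not state which table the authors applied, and the function name `third_position_changes` is hypothetical.

```python
from itertools import product

BASES = "TCAG"  # base order used by NCBI codon-table strings
# NCBI translation table 2 (vertebrate mitochondrial); '*' marks a stop codon.
VERT_MITO_AAS = (
    "FFLLSSSSYY**CCWW"
    "LLLLPPPPHHQQRRRR"
    "IIMMTTTTNNKKSS**"
    "VVVVAAAADDEEGGGG"
)
# product() varies the third base fastest, matching the string layout above.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), VERT_MITO_AAS)
}


def third_position_changes(base_from, base_to):
    """Count (synonymous, non-synonymous) outcomes of a third-position substitution."""
    synonymous = nonsynonymous = 0
    for first, second in product(BASES, repeat=2):
        before = CODON_TABLE[first + second + base_from]
        after = CODON_TABLE[first + second + base_to]
        if before == after:
            synonymous += 1
        else:
            nonsynonymous += 1
    return synonymous, nonsynonymous


print(third_position_changes("C", "T"))  # (16, 0)
print(third_position_changes("T", "C"))  # (16, 0)
```

Under this table every third-position C/T transition leaves the amino acid unchanged, which is consistent with such transitions dominating the synonymous class.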
Previous studies have explained the causes of phenotypic differentiation in the improved diploid carp lineages from the perspective of DNA (Hox gene family) sequence structural variation, likely due to factors such as genome recombination, duplications/deletions, alterations in the timing and level of gene expression, transposon activation, and epigenetic effects [30]. Our findings in this study provide new insights into the reasons for phenotypic differentiation in these lineages. Contrary to the expectation that offspring would most closely resemble their immediate parents, our analysis of the usage frequency of the twenty common proteinogenic amino acids showed that similarity was stronger within the same generation: IDC-F_1_ was most similar to IDMC-F_1_, and IDC-F_2_ (including IDC-F_2_-C and IDC-F_2_-M) was most similar to IDMC-F_2_. This observation is consistent with the conclusions of prior studies focused on nuclear DNA (Hox gene family) sequence structural variation [30]. We hypothesize that, under the shock of distant hybridization, the hybrid offspring’s genome undergoes severe upheaval, generating numerous mutations. A selective purging process may then intervene in subsequent self-cross generations. To establish viable offspring under this hybridization impact, strong selective pressure activates a selective clearance mechanism within the mitochondria. This mechanism purges incompatible mtDNA genetic variants from the F_1_ generation, leading to rapid purification of the F_2_ mitochondrial genomes, a sharp decline in their genetic diversity, and ultimately the attainment of a new stable state. In addition, we found that the usage frequencies of the three amino acids Ile, Met, and Trp in the IDC and IDMC lineages differed substantially from those in COC, BSB, and RCC. Analysis of the third codon position revealed that A-to-G or G-to-A transitions constituted the majority of non-synonymous mutations.
We therefore propose that the elevated Met usage likely resulted from an A-to-G substitution in the Ile codon ATA, converting it to ATG. Similarly, the increased Trp usage may have arisen from convergent mutations across multiple alternative codons. Collectively, these findings indicate a codon usage bias in the amino acid sequences of certain protein-coding genes.
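The effect of the proposed ATA-to-ATG substitution depends on the codon table used for translation. This section does not state which table was applied to the amino acid counts, so the sketch below compares two as an explicit assumption: the standard code (NCBI translation table 1) and the vertebrate mitochondrial code (NCBI translation table 2). The helper names `build_table` and `substitution_effect` are hypothetical.

```python
from itertools import product

BASES = "TCAG"  # base order used by NCBI codon-table strings
# '*' marks a stop codon in both strings.
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
VERT_MITO_AAS = "FFLLSSSSYY**CCWWLLLLPPPPHHQQRRRRIIMMTTTTNNKKSS**VVVVAAAADDEEGGGG"


def build_table(amino_acids):
    """Map each of the 64 codons to its one-letter amino acid."""
    return {
        "".join(codon): aa
        for codon, aa in zip(product(BASES, repeat=3), amino_acids)
    }


TABLES = {
    "standard": build_table(STANDARD_AAS),
    "vertebrate_mito": build_table(VERT_MITO_AAS),
}


def substitution_effect(codon_before, codon_after, table_name):
    """Return the amino acids encoded before and after a codon substitution."""
    table = TABLES[table_name]
    return table[codon_before], table[codon_after]


print(substitution_effect("ATA", "ATG", "standard"))         # ('I', 'M')
print(substitution_effect("ATA", "ATG", "vertebrate_mito"))  # ('M', 'M')
```

Under the standard table the substitution converts Ile to Met, whereas under the vertebrate mitochondrial table both codons already encode Met, so the interpretation of Ile and Met usage shifts is sensitive to this choice.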
Finally, analysis of the genetic variation in ten mitochondrial genes (structural regions) across different generations and among different individuals within the same generation of the IDC and IDMC lineages revealed that the mitochondrial genomes of IDC-F_1_ and IDMC-F_1_ did not adhere to a strict maternal inheritance mechanism. Instead, influenced by distant hybridization, they exhibited paternal base insertions. In mitochondrial genes (structural regions) such as COI, ND4, and ND5, varying degrees of paternal base insertion were observed in both IDC-F_1_ and IDMC-F_1_. Overall, however, the number of paternal base insertions was greater in IDC-F_1_ than in IDMC-F_1_ (calculated as the average number of paternal base insertions across different polymorphic individuals). Furthermore, in both IDC-F_1_ and IDMC-F_1_, the number and distribution of paternal leakage events within the same mitochondrial gene (structural region) varied significantly across different polymorphic individuals. Notably, in these mitochondrial genes (structural regions), a portion of the paternally inserted bases from IDC-F_1_ and IDMC-F_1_ were stably transmitted to IDC-F_2_ and IDMC-F_2_. Paternal mtDNA leakage is a documented phenomenon in numerous species, such as birds [44], fish [45], turtles [46], and frogs [47], as well as in plants including pine trees [48], geraniums [49], and peas [50]. Moreover, the vast majority of species exhibiting paternal leakage are of hybrid origin. In previous studies from our laboratory, we observed paternal leakage in triploid crucian carp [24], in offspring from hybridization between female koi carp and male blunt snout bream [51], and in the homodiploid crucian carp-like fish [29]. Moreover, some paternal leakage fragments can be stably inherited in their self-bred offspring, eventually forming stable chimeric sequence segments within the established lineages.
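The per-individual counts of paternal base insertions, and their average across polymorphic individuals, can be sketched as a site-by-site comparison of each offspring sequence with the two parental sequences. The classification rule below is an assumption about how such sites could be labeled, not the authors' documented procedure; the function names and the 12 bp toy fragments are hypothetical.

```python
def classify_sites(maternal, paternal, offspring):
    """Label each aligned site by which parental base the offspring carries."""
    if not (len(maternal) == len(paternal) == len(offspring)):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"shared": 0, "maternal": 0, "paternal": 0, "mutation": 0}
    for m, p, o in zip(maternal, paternal, offspring):
        if m == p:
            # Parents agree: the site is uninformative unless the offspring differs.
            counts["shared" if o == m else "mutation"] += 1
        elif o == m:
            counts["maternal"] += 1
        elif o == p:
            counts["paternal"] += 1
        else:
            counts["mutation"] += 1
    return counts


def mean_paternal_sites(maternal, paternal, individuals):
    """Average number of paternal-type sites across individuals."""
    per_individual = [
        classify_sites(maternal, paternal, seq)["paternal"] for seq in individuals
    ]
    return sum(per_individual) / len(per_individual)


# Hypothetical aligned fragments; the parents differ at three sites.
coc = "ACGTTGCAAGCT"  # maternal parent
bsb = "ACATTGTAAGTT"  # paternal parent
f1 = ["ACGTTGCAAGCT", "ACATTGCAAGCT", "ACATTGTAAGCC"]

print(classify_sites(coc, bsb, f1[2]))
# {'shared': 8, 'maternal': 1, 'paternal': 2, 'mutation': 1}
print(mean_paternal_sites(coc, bsb, f1))  # 1.0
```

The three toy individuals carry zero, one, and two paternal-type sites, mirroring the individual-to-individual variation described above.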
Compared to the mtDNA inheritance patterns identified in previous studies [24,29,51], our research on the IDC and IDMC lineages, which share the same hybrid origin (female common carp × male blunt snout bream), reveals an even more dynamic picture of mtDNA inheritance: paternal mtDNA leakage varies significantly not only between lineages but also among individuals. Comparative analysis indicates that even among hybrid lineages with identical parentage, the intensity, distribution, and intergenerational fate of paternal mtDNA leakage can vary significantly. The presence of paternal mtDNA can therefore affect genetic diversity and estimates of gene flow [19]. This has significant implications because it challenges the foundational assumption of strict maternal mtDNA inheritance, which underpins the use of mtDNA as a key molecular marker in phylogenetic inference [52], population genetics [53], and conservation [54]. Paternal leakage occurs to varying degrees during hybridization in many species. Studies of paternal mtDNA leakage in Drosophila have shown that interspecific hybridization exhibits a higher leakage rate than intraspecific crosses [55,56]. Studies in mice demonstrated that paternal mtDNA could only be detected under interspecific hybridization conditions. Furthermore, this leaked mtDNA was neither ubiquitously distributed across all F_1_ hybrid tissues nor transmitted to subsequent generations via the female germline [57,58]. Dokianakis et al. found in Drosophila that paternal mtDNA leakage was more frequent in male than in female offspring. This suggests that paternal mtDNA leakage may not be a random outcome of an error-prone mechanism but rather occurs under complex genetic control [59,60], which further supports our hypothesis regarding the mitochondrial selective clearance mechanism.
Some studies also suggest that reciprocal crosses within the same population may result in varying degrees of paternal leakage, particularly in the cross direction that produces fewer or lower-fitness offspring [61,62]. The extent of paternal leakage also varies greatly among species. For instance, in a specific interspecific hybridization of cicadas, 46% of the offspring exhibited paternal leakage [62]. Similarly, studies in Drosophila interspecific backcrosses and inter-population hybrids of potato cyst nematodes reported paternal leakage in 31–63% and 40% of offspring, respectively [20,63]. In addition to the insertion of paternal bases, mutations occurred at some base sites in both IDC-F_1_ and IDMC-F_1_. Compared to IDMC-F_1_, IDC-F_1_ harbored more mutation sites (calculated based on the average number of mutated base sites among different polymorphic individuals). The majority of these mutation sites were consistent with those in RCC, suggesting that during the distant hybridization between female common carp and male blunt snout bream, there may be a mutational shift toward crucian carp genetic components within the mitochondrial genome. Similar to the pattern of paternal leakage, in both IDC-F_1_ and IDMC-F_1_, the number of base mutations occurring within the same mitochondrial gene (structural region) varied significantly among different polymorphic individuals, suggesting that these base mutations were influenced by paternal base insertion. Finally, our analysis revealed a marked contrast between the F_1_ and F_2_ generations. The mitochondrial genes (structural regions) of IDC-F_1_ and IDMC-F_1_ were genetically unstable, harboring multiple polymorphic sites and diverse variation patterns among individuals.
In stark contrast, in both IDC-F_2_ and IDMC-F_2_, the number of polymorphic sites was drastically reduced, and the mtDNA genetic variation patterns exhibited among different individuals within the same population were also diminished. Our findings reveal that rapid structural variations occurred in the mitochondrial genomes of both IDC-F_1_ and IDMC-F_1_ derived from distant hybridization between female common carp and male blunt snout bream. While the mitochondrial genomes of these newly established diploid hybrid lineages exhibited instability, those of the IDC-F_2_ (including IDC-F_2_-C and IDC-F_2_-M) and IDMC-F_2_ showed a clear trend toward stabilization.
5. Conclusions
In this study, we analyzed the complete mitochondrial genome structure of two types of improved diploid carp lineages derived from distant hybridization between female common carp and male blunt snout bream. The results revealed instability in the F_1_ mtDNA of both improved lineages. However, the extent of structural variation in the mitochondrial genomes was substantially reduced and gradually stabilized in their respective self-cross offspring (F_2_). We speculate that to establish viable offspring under the impact of distant hybridization, strong selective pressure activates a purifying selection mechanism within the mitochondria. This mechanism purges incompatible mtDNA genetic variants from the F_1_ generation, leading to the rapid purification of the F_2_ mitochondrial genomes, a sharp decline in their genetic diversity, and ultimately the establishment of a new stable state. These findings hold significant implications for understanding mitochondrial genome evolution in distant hybrid species and for fish genetic breeding practices. Furthermore, the discovery of mitochondrial paternal leakage in both types of improved diploid carp lineages indicates that these lineages can serve as an ideal model for elucidating the regulatory mechanisms by which paternal mtDNA leakage influences hybrid species adaptability. Notably, we also found significant differences in the number and distribution of paternal base insertions within the same mitochondrial gene (structural region) among different polymorphic individuals in the F_1_ generation of both improved diploid carp lineages. This indicates that the stringency of the paternal mtDNA elimination mechanism varies markedly among polymorphic individuals across different hybrid lineage generations, reflecting the randomness of paternal leakage. 
Our findings not only provide novel evidence for the phenomenon of mitochondrial paternal leakage in animals but are also crucial for elucidating the underlying mechanisms of strict maternal mtDNA inheritance in species. Furthermore, they hold profound significance for investigating how the extent of paternal mtDNA leakage influences hybrid species adaptability and for revealing the potential molecular mechanisms.
References
- 1. Mallet J. Hybridization as an invasion of the genome. Trends Ecol. Evol. 2005, 20, 229–237. doi:10.1016/j.tree.2005.02.010. PMID: 16701374.
- 2. Lewontin R.C., Birch L.C. Hybridization as a source of variation for adaptation to new environments. Evolution 1966, 20, 315–336. doi:10.1111/j.1558-5646.1966.tb03369.x. PMID: 28562982.
- 3. Abbott R., Albach D., Ansell S., Arntzen J.W., Baird S.J., Bierne N., Boughman J., Brelsford A., Buerkle C.A., Buggs R. Hybridization and speciation. J. Evol. Biol. 2013, 26, 229–246. doi:10.1111/j.1420-9101.2012.02599.x. PMID: 23323997.
- 4. Li D., Huang Z., Song S., Xin Y., Mao D., Lv Q., Zhou M., Tian D., Tang M., Wu Q. Integrated analysis of phenome, genome, and transcriptome of hybrid rice uncovered multiple heterosis-related loci for yield increase. Proc. Natl. Acad. Sci. USA 2016, 113, E6026–E6035. doi:10.1073/pnas.1610115113. PMID: 27663737. PMCID: PMC5068331.
- 5. Fahlvik N., Rytter L., Stener L.-G. Production of hybrid aspen on agricultural land during one rotation in southern Sweden. J. For. Res. 2021, 32, 181–189. doi:10.1007/s11676-019-01067-9.
- 6. Bartley D.M., Rana K., Immink A.J. The use of inter-specific hybrids in aquaculture and fisheries. Rev. Fish Biol. Fish. 2000, 10, 325–337. doi:10.1023/A:1016691725361.
- 7. Li Z., Li B., Tong Y. The contribution of distant hybridization with decaploid Agropyron elongatum to wheat improvement in China. J. Genet. Genom. 2008, 35, 451–456. doi:10.1016/S1673-8527(08)60062-4. PMID: 18721781.
- 8. Liu Q., Wang S., Tang C., Tao M., Zhang C., Zhou Y., Qin Q., Luo K., Wu C., Hu F. The Research Advances in Distant Hybridization and Gynogenesis in Fish. Rev. Aquac. 2025, 17, e12972. doi:10.1111/raq.12972.
