Complex evolution of East Asian Tertiary relict species revealed by the phylogeography of Lindera obtusiloba
Jun-Wei Ye, Rui Yang, Lei Bao, Meng-Jing Dai, Hong-Fang Wang, Jian-Ping Ge

TL;DR
The study explores the complex evolutionary history of Lindera obtusiloba in East Asia, revealing genetic patterns and migration events during the Tertiary and Quaternary periods.
Contribution
The study provides new insights into the phylogeography and diversification processes of a Tertiary relict species using nuclear low-copy genes.
Findings
Genetic diversity and population size were higher in low latitudes compared to high latitudes.
A genetic discontinuity between high and low latitudes suggests limited gene flow since the middle Pleistocene.
Both High- and Low-Latitude Origin scenarios were supported, but the Vicariance Origin was also indicated by ancestral area reconstruction.
Abstract
In East Asia, the evolutionary histories of Tertiary relicts are complex, involving not only southward retreats during the middle and late Tertiary but also repeated migrations northward or southward during the Quaternary. Three possible scenarios are hypothesized: High-Latitude Origin, Low-Latitude Origin, and Vicariance Origin. The phylogeography of Lindera obtusiloba is investigated using 27 independent nuclear low-copy genes to test these scenarios and explore key diversification processes. The genetic diversity and effective population size at low latitudes were approximately two/three times greater than those at high latitudes. A sharp genetic discontinuity with reciprocal monophyletic lineages and limited gene flow that diverged during the middle Pleistocene was revealed between high and low latitudes. The ancestral population was small, and demographic expansions occurred in…
Genes, proteins, chemicals, diseases, species, mutations and cell lines named across the full text — each resolved to its canonical identifier and authoritative record.
Click any figure to enlarge with its caption.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5- —Xingdian Talent Support Program
Peer Reviews
No public reviews on file for this paper yet. If you reviewed it on a platform where reviews are public (OpenReview, ICLR, NeurIPS, ICML), you can paste yours below so the community can read it here.
Videos
No videos yet. Explain this paper in a talk, walkthrough, or lecture? Add one.
Taxonomy
TopicsGenetic diversity and population structure · Evolution and Paleontology Studies · Plant Diversity and Evolution
Background
East Asia is characterized by high levels of biodiversity and endemism [1, 2]. Extreme physiographic and climatic heterogeneity, combined with the intensification of monsoons, provide long-term refugia for Tertiary relicts, which were once mostly widely distributed across the Northern Hemisphere [2–4]. The relict species experienced not only southward retreat during the climate cooling from the middle to late Tertiary [5, 6] but also repeated migrations northward or southward due to periodic climatic fluctuations during the Quaternary [7, 8]. Thus, their evolutionary histories are more complicated than those of species originating from low latitudes (e.g., tropical or subtropical regions).
The evolution of East Asian Tertiary relict species can be demonstrated by three distinct hypotheses. The Low-Latitude Origin (LLO) hypothesis suggests that the relict species initially diverged at low latitudes, with parts of the ancestral populations then migrating to higher latitudes [8]. A gradual decrease in genetic diversity with increasing latitude is predicted, and the genetic lineage of northern populations is nested within the southern lineage [7, 8]. Some phylogeographic case studies can be found [9, 10]. The definition and genetic predictions of the High-Latitude Origin (HLO) hypothesis are exactly the opposite of those for the LLO hypothesis. Evidence supporting the HLO scenario is primarily derived from phylogenetic studies, such as those on American oaks [11] and Meehania [12], while phylogeographic studies are limited, with Gugger et al. [13] being a rare example. Vicariance Origin (VCO) suggests that populations at high and low latitudes have diverged due to intermediate geographical barriers, such as the East China Sea (ECS) [8], or environmental barriers, such as the aridity belt [14]. These barriers likely result in similar levels of genetic diversity and reciprocal monophyletic lineages [8]. The dated divergence time would be older than the formation of the genetic barrier [8].
These three hypotheses can be evaluated in the context of East Asia’s two main refugial regions (NEA and SEA; see below) [14]. Understanding how Tertiary relicts are distributed across these areas allows us to test whether diversification primarily occurred in the north or south or through long-term vicariance between them. One refugia region is centered in Japan, Korea, and adjacent northeastern China (NEA), and the other is located in southern and southeastern China and extends into the Himalayas (SEA) [5]. Several Tertiary relict species (or species pairs) are distributed across both refugia regions, including Acer pictum, Kalopanax septemlobus, Lindera obtusiloba, Juglans mandschrica and J. cathayensis, Euptelea pleiosperma, and E. polyandra. The phylogeography of some species revealed the presence of independent clades in the NEA and SEA, with no clear pattern of “southern richness and northern poverty”. Examples include Juglans spp. [15], Euptelea [16], and L. obtusiloba [14]. The distinct clades can be associated with the redevelopment of the arid belt during the late Miocene to Pliocene epochs or the sea level changes of the ECS [14, 15]. Therefore, which origin scenario is the most suitable model for the evolution of the Tertiary relicts in East Asia remains unclear.
L. obtusiloba, a Tertiary species widely distributed in East Asia, presents a suitable case for testing the three origin hypotheses. First, our previous study revealed that the NEA and SEA L. obtusiloba populations were genetically isolated with limited gene flow through chloroplast DNA (cpDNA) and nuclear microsatellites (nSSRs), supporting the VCO scenario. Second, greater genetic diversity of nSSRs was detected in NEA. More abundant Miocene L. paraobtusiloba fossils, an analog of L. obtusiloba [17], have been found in NEA than in SEA, and the oldest fossil (dating back to the Paleocene) was discovered in northeastern China (approximately 50° N) [14]. Certain cold-tolerance traits, such as buds surrounded by leathery bracts [18], may have aided the species in surviving the cold climate in NEA after the Tertiary. These findings are evidence for the HLO scenario. Finally, more haplotypes and genetic diversity of cpDNA and greater morphological variation were shown in SEA [14], supporting the LLO scenario. Thus, whether the evolution of L. obtusiloba conforms to HLO, LLO or VCO needs further study.
In this study, we sequenced 27 nuclear low-copy genes (nLCGs) developed from transcriptome data [19]. Genetic structure and divergence time estimations were performed using Bayesian clustering and Bayesian phylogenetic inference based on the nLCGs. Ancestral area reconstruction was applied to infer possible ancestral population distributions. Different evolutionary scenarios representing HLO, LLO, and VCO with different demographic processes were designed and tested using the approximate Bayesian computation (ABC) model. Key parameters of demographic processes, such as effective population size (Ne) and gene flow, were also estimated. Potential distributions and differences in niches between SEA and NEA populations were analyzed. We intend to answer the following questions: (1) Whether the extant L. obtusiloba populations originate from HLO, LLO, or VCO? (2) What are the detailed diversification and demographic history of L. obtusiloba in the two regions?
Methods
Sampling and DNA extraction
We sampled 90 individuals of L. obtusiloba from 24 populations, with 12 populations each in the NEA and SEA regions, encompassing its entire distribution range. This includes 20 populations from Ye et al.’s study [14] and four additional populations: one from Xizang (BM) in China and three from Japan (UH, KI, and NI) (Table 1; Fig. 1). To minimize the collection of closely related individuals, the sampled individuals were spaced at least 30 m apart. Samples from the related species, L. erythrocarpa, were collected as an outgroup. The samples were dried and stored in silica gel prior to the extraction of total genomic DNA using a plant genomic DNA extraction kit (Tiangen, Beijing). Voucher samples (BNU23236–23259), identified by Hong-Fang Wang, were deposited at the Beijing Normal University Herbarium (BNU; Table S2).
Table 1. Genetic diversity of 27 nuclear low-copy genes for 24 populations of Lindera obtusilobaCodeRegion/PopulationLocationLatitudeLongitude n (sample size) Allelic richnessNucleotide diversity (× 10^− 3^)SEA region4512.043.521BMBomi, Xizang, China29.8795.7331.641.002PMAPianma, Yunnan, China25.9998.6641.691.303LAJLajing, Yunnan, China26.4999.2851.831.514WEIXWeixi, Yunnan, China27.1899.2931.851.615XZDXiaozhongdian, Yunnan, China27.3499.8431.731.206ANZHAnzihe Nature Reserve, Sichuan, China30.81103.1351.580.707MCSHMt. Micang, Shannxi, China32.69107.5331.901.458BDGSMt. Badagong, Hunan, China29.69109.7931.941.539LISHMt. Li, Shanxi, China35.43111.9831.610.8710DBSHMt. Daba, Anhui, China31.01116.1151.711.2911WYSHMt. Wuyi, Jiangxi, China27.93117.6931.631.0212TMSHMt. Tianmu, Zhejiang, China30.42119.4151.721.28NEA region456.061.9613YTSHMt. Yuntai, Jiangsu, China34.72119.4431.581.0214DALDalian, Liaoning, China38.90121.4631.431.1915KYSHMt. Kunyu, Shandong, China37.26121.7331.701.4316XRDZhuanghe, Liaoning, China40.02122.9651.661.5317BHSHBukhansan National Park, Seoul City, Korea37.65126.9931.791.1918ZYSHMt. Jiri, South Gyeongsang Province, Korea35.29127.4931.711.5419XYSHSeoraksan National Park, Gangwon Province, Korea38.17128.4931.571.5020JWSGariwangsan, Gangwon Province, Korea37.43128.5651.511.1021UHMasuda-shi Shimane-ken, Japan34.55132.0461.681.1022KIKawakami, Nagano-ken, Japan36.73138.1531.921.8523JAPTokyo, Japan35.95139.3061.891.8124NINikkō-shi, Tochigi-ken, Japan36.75139.4221.440.76
Fig. 1a, c Color-coded grouping of the 24 Lindera obtusiloba populations according to the most likely K = 2/4, as inferred by STRUCTURE, using 27 low-copy nuclear gene loci. b, d Histogram of the STRUCTURE assignment test for all populations at their most likely K = 2/4. The light orange and blue shadows represent the SEA and NEA regions of East Asian Tertiary relict flora, respectively
Nuclear DNA
Sequencing and neutrality tests
Twenty-seven nLCGs were sequenced from all the sampled L. obtusiloba individuals (Table 1; Ye et al. [19]). In L. erythrocarpa, only 15 loci were successfully sequenced. CodonCode Aligner 3.6.1 (https://www.codoncode.com/aligner/) with the CLUSTAL module was used to edit and align all sequences, and the PHASE function in DnaSP 5.10.01 [20] was utilized to phase heterozygous sequences. The newly obtained nLCGs were uploaded to GenBank (MF152421–MF152590). We assessed selective neutrality using the Ewens–Watterson test, Tajima’s D, and Fu’s FS, as calculated by Arlequin 3.5 [21].
Genetic diversity and structure
Measures of genetic diversity, including nucleotide diversity (π) and haplotype diversity (Hd) at the locus level, as well as genetic diversity (π and allelic richness, AR) at the population level, were calculated for the NEA and SEA regions using SPADS 1.0 [22]. The 27 loci were associated with different putative functions, as justified by the gene information of the closest gene [19]; thus, these loci were treated as independent loci in the subsequent genetic structure, phylogeny and demographic analyses. Population structure was assessed using a model-based Bayesian algorithm in STRUCTURE 2.3.4 [23] with an admixture model of population structure, and allele frequencies were assumed to be correlated among populations using a converted input file by SPADS 1.0 [22]. Ten independent runs were performed for each population (K) from 1 to 20 with 1 × 10^5^ Markov chain Monte Carlo (MCMC) steps of burn-in, followed by 1 × 10^6^ steps. The most likely number of clusters was determined by the change in the log-likelihood of the data, LnP(D), and the second-order rate of LnP(D) change, ΔK [24]. The average membership coefficients for the 10 simulation runs of a given K value were generated by CLUMPP 1.1.2 [25], and a graphical representation of the average membership coefficient for each individual was generated in Distruct 1.1 [26]. The geographical distributions of different clusters were displayed in ArcMap 10.2 (ESRI, Redlands, CA, USA). Pairwise genetic differentiation (FST) between genetic clusters was calculated using SPADS 1.0 with 1000 permutations.
Bayesian phylogeny and divergence time
The Bayesian phylogeny of the 24 populations and L. erythrocarpa was inferred using *BEAST [27] in BEAST 2.4 [28] with all partitions unlinked. The substitution models for each locus were chosen in jModelTest 2.0.1 [29]. Because no nLCG substitution rates have been published for Lindera or Lauraceae species, we estimated the nLCG substitution rate of L. obtusiloba by the suggested substitution ratio between cpDNA and nuclear DNA, which is estimated as 3:16 [30]. Based on fossil data and phylogenetic information, the cpDNA substitution rate of L. obtusiloba was estimated to be 8.59 × 10^− 10^, with a 95% highest probability density interval (HPD) of 3.12–23.80 × 10^− 10^ per site per year [14]. Hence, the estimated substitution rate of nLCG was 4.58 × 10^− 9^ (95% HPD: 1.66–12.69 × 10^− 9^) per site per year, which was applied to estimate coalescence time based on the Bayesian phylogeny.
The MCMC algorithm was set to run for 1 × 10^9^ steps, with sampling occurring every 1 × 10^4^ steps. The initial 20% of the samples were discarded as burn-in. The quality of the MCMC output was assessed by examining the effective sample size (ESS) value in Tracer 1.6 (http://tree.bio.ed.ac.uk/software/tracer). A maximum clade credibility tree was chosen from the sampled posterior distribution using the program TreeAnnotator 2.5.1 (part of the BEAST package). The tree topology and coalescence times were visualized in FIGTREE 1.4.2 (http://tree.bio.ed.ac.uk/software/figtree). Multiple runs of combinations of different clock models (strict clock or lognormal clock model) and different coalescent priors (Yule prior or coalescent prior) resulted in similar divergence times for all populations, the SEA or NEA populations, and the results with strict clock models and the Yule process were reported.
IMa analysis
We utilized the ‘isolation with migration’ (IM) coalescent model, as implemented in IMa2 [31], to estimate the Ne of NEA, SEA, and their common ancestral populations, the bidirectional migration rates (2Nm) and the divergence time (t) between the two regions. All the loci were analyzed using the HKY model, and the inheritance scalars were set to 1. The corresponding values of u were calculated using the formula u = µkg, where µ represents the substitutions per site per year (4.58 × 10^− 9^, 95% HPD: 1.66–12.69 × 10^− 9^), k is the sequence length of each locus, and g is the generation time, which is approximately 15 years in L. obtusiloba [14].
MCMC simulations were conducted using a geometric heating scheme (with the required two terms set to 0.99 and 0.75) across 60 chains. The burn-in phase was run for a sufficient duration to ensure that no perceivable trends were present in the trend plots. Subsequently, 2 × 10^6^ steps were executed to save 2 × 10^4^ genealogies. The posterior Hipt (highest bin value) and 95% HPD for the demographic parameters are reported. Only estimates whose posterior distribution fell to zero within the prior intervals were considered reliable. All individuals were included, and two independent runs were performed. The results of the second run are reported.
MsABC modeling
To determine whether L. obtusiloba originated through HLO, LLO, or VCO, the approximate Bayesian computation (ABC) algorithm was employed using msABC [32]. A split model was defined as the derived population formed from an expansion of a fraction of the ancestral population with minimal or no change in population size. In contrast, the vicariance model assumes that the population size of the ancestral populations is the sum of all derived populations. Six split and nine vicariance scenarios with different demographic parameters were constructed (see supporting file 1).
After simulation of 2 × 10^6^ steps in each model, model comparison and parameter estimation were conducted in the statistics package ‘abc’ in R 3.5.0 (https://www.r-project.org/). The multinomial logistic regression method (“mnlogistic” function) was used to infer the posterior probability of each demographic model with a tolerance rate of 0.01. For the best-supported model, the divergence time, Ne and gene flow were estimated using Neuronet’s method with nnet = 50. The median value and 95% HPD of all the estimations are reported. Two independent runs were conducted for msABC simulations, model selection and parameter estimation to ensure congruence, and the results of the second run are reported.
Ecological niche modeling and climatic differences
Ten uncorrelated climatic variables (r < 0.9 [14]), were used to model the current niches of the NEA (35 occurrences) and SEA (65 occurrences) populations through the maximum-entropy modeling technique (MaxEnt [33]). Potential distributions at the Last Glacial Maximum (LGM) were predicted with MIROC-ESM, an Earth System Model based on the Model for Interdisciplinary Research on Climate [34] and CCSM4, a Community Climate System Model, version 4 [35]. In ENMTools, a background test with 1000 pseudoreplicates based on the 10 climatic variables was used to test whether the ecological niche models (ENMs) generated from the two regions were identical [36]. Tests were performed in both directions (NEA vs. SEA and vice versa) by taking background samples from buffer zones (0.5°), defined based on their dispersal ability [37]. Niche similarity was quantified using both Schoener’s D and the standardized Hellinger distance (I) [36].
Nineteen climate variables, downloaded from WorldClim [38] at a 2.5-arcminute resolution, were extracted using Spatial Analyst Tools in ArcGIS 10.2 (ESRI, Redlands, CA, USA). A principal component analysis (PCA, using the ‘princomp’ function) was conducted in R. Isolation by environment (IBE) was assessed through partial Mantel tests (‘mantel.partial’ function) between the FST and climatic (Euclidean) distance, accounting for geographic distance in the form of a natural logarithm, which was based on 10,000 permutations. These analyses were conducted across all the populations, as well as in the SEA and NEA populations.
Ancestral area reconstructions
To reconstruct the geographical diversification, five geographic regions, Eastern Himalayan Province (A), Sikang–Yunnan Province (B), Central Chinese Province (C), North Chinese Province (D) and Japanese–Korean Province (E), were defined according to the floristic divisions [1]. BioGeoBEARS 1.1.1 [39] in R was applied based on the dated Bayesian phylogeny. An equal probability for exchange between different areas was applied, and the maximum number of areas was assigned to five. Six models (DEC, DEC + j, DIVALIKE, DIVALIKE + j, BAYAREALIKE, and BAYAREALIKE + j) were generated and compared based on the AICc weights.
Results
Genetic diversity and genetic structure
No loci significantly violated the neutrality assumption (Table S1). The sequence length ranged from 154 to 944 base pairs (bp), totaling 13,285 bp; the number of variable sites ranged from three to 71; Hd ranged from 0.56 to 0.97; and π ranged from 1.6 × 10^− 3^–8.56 × 10^− 3^ (Table S1). Among the 24 populations, the AR ranged from 1.44 to 1.90, and the π ranged from 0.70 × 10^− 3^–1.85 × 10^− 3^ (Table 1). Genetic diversity in the SEA region (AR = 12.04, π = 3.52 × 10^− 3^) was almost twice that in the NEA region (AR = 6.06, π = 1.96 × 10^− 3^), with an equal sample size (n = 45).
In the STRUCTURE analysis, ΔK had the highest value when K = 2 (Fig. S2), with the two clusters corresponding to the NEA and SEA division (Fig. 1a). The second highest ΔK occurred when K = 4, and lnP(D) also almost reached the highest value (Fig. 1b and S1). The four clusters formed four reciprocally monophyletic lineages (PP > 0.95) in the BEAST-derived phylogeny (Fig. 2). Greater genetic differentiation was detected between the two clusters in SEA (FST = 0.54) than between the two in NEA (FST = 0.39), and the greatest genetic differentiation was detected between clusters from NEA and SEA (FST = 0.75–0.79) (Table 2).
Fig. 2BEAST-derived chronograms of 24 populations of Lindera obtusiloba with outgroup L. erythrocarpa using 15 low-copy nuclear gene loci (the remaining 12 loci were not successfully sequenced in L. erythrocarpa) based on the estimated substitution rate, mean: 4.58 × 10^− 9^ (95% highest-probability-density interval, HPD, 1.66–12.69 × 10^− 9^). Black dots in nodes indicate high posterior probabilities (PP > 0.95). Median divergence time and 95% HPD in key nodes are labeled; all estimated divergence times were scaled in ka, a thousand years ago. Clades and populations in the NEA and SEA regions are colored in blue and orange, respectively. Four clusters in accord with STRUCTURE results (see Fig. 1) are shown in the same color
Table 2. Pairwise FST values between all four clusters of* Lindera obtusiloba* derived from Bayesian clustering and phylogeny using 27 nuclear low-copy genesClustersW-SEAE-SEAW-NEAE-NEAW-SEA0.00E-SEA0.540.00W-NEA0.780.790.00E-NEA0.750.770.390.00
Population demographic history
BEAST analysis revealed that the coalescence time for all the populations was 757.04 ka (95% HPD: 578.35–959.01 ka), with the populations in NEA and SEA coalescing at 63.22 ka (95% HPD: 38.43–92.94 ka) and 395.42 ka (95% HPD: 293.26–517.89 ka), respectively (Fig. 2). The IMa2 estimated a coalescence time at 798.61 ka (95% HPD: 577.16–96.49 ka). The estimated gene flow was low (Nm < 0.01), and the effective population size of SEA populations (NSEA = 2.4 × 10^4^) was approximately three times that of NEA populations (NNEA = 0.9 × 10^4^) and ten times that of ancestral populations (Table 3 and Fig. S3).
Table 3. Posterior Hipt (the Bin with the highest value) and 95% highest posterior density interval (HPD) of divergence time (t), effective population size and population migration rates (2Nm) estimations simulated by isolation by migration (IM) model using 27 nuclear low-copy genes of Lindera obtusiloba. ka, a thousand years agoParametert (ka)Effective population sizePopulation migration ratesSEANEAAncestralSEA→NEANEA→SEAHiPt798.6123,6368,8733,3770.0010.01795% HPD low577.1620,0246,51823600.00295% HPD high996.4927,72011,3869,6590.0290.052
MsABC exhibited a high posterior probability of originating from SEA (PP = 0.46) or NEA (PP = 0.34), whereas the other 13 models had probabilities below 0.04. Similar demographic parameters were estimated in the two scenarios. The effective population size of SEA (3.2 × 10^4^) was greater than that of NEA (1.2 × 10^4^), and both populations experienced severe shrinkage to similar sizes (approximately 2000) at similar ages (109 ka in NEA and 91 ka in SEA) after coalescence at 583 ka. Greater migration (Nm) from NEA to SEA (0.25 vs. 0.09) was estimated in the SEA origin scenario, whereas the opposite migration (0.21 vs. 0.10) was greater in the NEA origin scenario (Fig. 3 and Figs. S4-5).
Fig. 3. The two most possible scenarios, SEA origin (a) and NEA origin (b) in msABC. Posterior probabilities (PP) in different scenarios are shown. Divergence times (T, TSEA and TNEA) are scaled in ka, a thousand years ago. Gene flow (Nm) across the divergence history is labeled. Effective population sizes of ancestral population (NSEA’ and NSEA’) and extant populations (NSEA and NSEA) are shown
Climatic-induced differences
In the PCA, the first two axes accounted for 67.7% of the total variance, with Axis 1 explaining 41.8% and Axis 2 explaining 25.9% (see Table S2). PC1 and PC2 are primarily associated with precipitation and temperature, respectively (Table S2). The data in Fig. 5a suggest that populations from SEA and NEA inhabit distinct environmental and climatic niches, with the niches of the former being significantly larger. The IBE indicates potential ecological adaptation in PC2 (r = 0.22, P = 0.01), whereas there is no such indication in PC1 (r = 0.02, P = 0.22). Analyses using the 19 variables suggest potential adaptations to both precipitation and temperature (Table S3).
At present, the potential distribution in both regions suggests possible extensions into the other region. In NEA, extensive suitable habitat is predicted in the northern Japanese archipelago. During the LGM, the potential distribution of the NEA populations extended into SEA regions, whereas the SEA population retracted to lower latitudes (Fig. 4). Background tests indicated that the environmental conditions in the two regions were more divergent than anticipated (P < 0.05) when the habitat available to NEA populations is considered. In contrast, this divergence was not observed based on the availability of SEA populations (Figs. 5b&c).Fig. 4. Potential species distributions modelled through ten uncorrelated climatic variables (r < 0.9 [14]), of SEA (a-c) and NEA (d-f) populations of Lindera obtusiloba based on ecological niche modelling at present (a, d), during the Last Glacial Maximum using the CCSM4 model (b, e), and the MIROC-ESM model (c, f)Fig. 5a Principal component analysis (PCA) plots with 19 climatic variables (Table S2) of 100 Lindera obtusiloba occurrence points. Different colors correspond to the SEA and NEA region. b, c Niche background test plots for 10 non-correlated climatic variable between populations from NEA and SEA regions quantified by the standardized Hellinger distance (I) and Schoener’s D [36]. The vertical line in each plot represents the observed values (obs.) while the histograms represent those of null distributions
Ancestral area reconstructions
BioGeoBEARS revealed that the DEC + j model best explained the divergence of L. obtusiloba populations based on AICc (DEC = 53.4, DEC + j = 43.6, DIVALIKE = 49.9, DIVALIKE + j = 45.3, BAYAREALIKE = 84.7, and BAYAREALIKE + j = 50.1). The model suggested that the ancient (mid–Pleistocene) distribution of L. obtusiloba was probable in the BE (Sikang–Yunnan Province and the Korean peninsula and Japanese archipelago). A vicariance event between NEA (E) and SEA (B) occurred. Within SEA, the ancestral area was likely in southwest China (B), and colonization from southwest China (B) to the Eastern Himalayan Province (A) and Central Chinese Province (C) occurred. In NEA, the ancestral area was likely in the Korean peninsula and Japanese archipelago (E), and ancestral populations experienced westward migrations to the North Chinese Province (D) (Fig. S6).
Discussion
The complex origin history of L. obtusiloba in East Asia
To test which of the three hypotheses—High-Latitude Origin (HLO), Low-Latitude Origin (LLO), or Vicariance (VCO)—best explains the evolutionary history of Tertiary relict species in East Asia, we reconstructed the phylogeography of L. obtusiloba through ample nuclear gene- and climate-induced difference analyses. Overall, our results are most consistent with dual refugia and asymmetric persistence, with SEA acting as a long-term demographic reservoir and NEA persisting in smaller, environmentally constrained pockets rather than a single, unambiguous origin.
In SEA, a much greater current effective population size and genetic diversity were revealed than in NEA, consistent with the higher cpDNA diversity [14], supporting the LLO pattern [7, 8]. The SEA origin scenario in the MsABC approach also received the highest support. The environments in the two regions were more divergent than expected based on the habitats available to the NEA populations, indicating that L. obtusiloba preferred a humid and warm climate in subtropical regions [40]. HLO is supported mainly by fossil evidence. L. paraobtusiloba, a fossil species that closely resembles L. obtusiloba, had its oldest occurrence in NEA (dated to the Paleocene, approximately 50° N) and its youngest one in SEA (Yunnan Province, approximately 25° N) [14]. In the Miocene, more fossils were found in NEA than in SEA [14]. The potential distribution modelling also show the present NEA populations has possibilities to colonize northern Japanese archipelago with severer environment. The NEA origin scenario in msABC (PP = 0.34) provides additional modest support. The higher genetic diversity of nSSRs in the NEA region may have been caused by a limited number of loci (n = 6) [14]. However, the Bayesian phylogeny revealed two reciprocal monophyletic lineages, with no nested relationships, favoring the VCO scenario [8]. Ancestral area reconstruction further supported the VCO pattern, indicating long-term persistence in dual refugia (SEA and NEA regions). The best fit model (DEC + j) indicates that the founder-event process is also included, which are likely occurred in demographic history within the two regions [39]. The ABC model did not find evidence of population shrinkage during the LGM, which further supports that the NEA and SEA regions acted as dual refugia for L. obtusiloba. In NEA, cold tolerance characteristics, such as heavier seeds and later-maturing fruits [14], provide the possibility of long-term northern persistence [18] or may imply adaptations to the cooler climate (IBE analysis) of L. obtusiloba to the limited niche.
Although the three origin hypotheses are conceptually mutually exclusive, ancient divergence and subsequent asymmetric evolution with small effective population sizes (approximately 2000) may obscure clear genetic signals for dispersal direction (HLO, LLO, or VCO). Migrations between the NEA and SEA regions are likely facilitated by the exposure of the ECS seafloor since the late Miocene: 7.0–5.0 Ma, 2.0–1.3 Ma [16, 41], as the arid belt likely acted as a climatic barrier for genetic exchanges [14, 15]. Although a final ECS exposure occurred from 0.2 to 0.015 Ma [41], intensified aridity after the middle Pleistocene and possible adaptation to the environment of NEA populations may have prevented migration through the ECS [14], leading to long-term independent evolution of NEA and SEA populations.
Compared with our previous study, in which cpDNA (approximately 3 kb) and nSSRs (n = 6) were used [14], the 27 unlinked loci [19] provide much more information. The nLCG data reveal not only a sharp NEA–SEA genetic discontinuity but also a clear substructure and more detailed population demographic histories within each region (Figs. 1 and 2). However, the total length of nLCGs (approximately 13 kb) represents limited coverage of the total nuclear genome [42], limiting the ability to distinguish different evolution models (HLO, LLO and VCO).
Detailed diversification history of L. obtusiloba
The nuclear genes traced the divergence of NEA and SEA populations to the mid–Pleistocene Transition period (MPT, 0.7–1.25 Ma) [43], whereas Ye et al. [14] reported that divergence occurred during the late Pliocene using cpDNA and nSSRs. The limited migration of cpDNA (via seeds) compared with that of nuclear genes (via seeds and pollen) should result in a larger effective population size and thus a longer coalescence time [44]. Incomplete lineage sorting and randomness in the coalescence process will result in serious overestimation of the divergence time in a gene tree (cpDNA) compared with that of a species tree (multiple nLCGs) [44]. Uncertainty in mutation models of nSSRs [44] and estimations without considering gene flow may cause an earlier estimate of the divergence time in nSSRs [14]. The MPT was characterized by alternation from mostly symmetric cycles with periods of approximately 41 ka to strongly asymmetric 100-ka cycles [43], and changes in climate may have isolated populations in various refugia, leading to divergence between NEA and SEA populations [45].
The SEA mountainous regions provided long-term refugia for Tertiary relict species [46]. Greater physiographic and climatic heterogeneity in SEA offered wider environmental niches (Fig. 5), combined with the intensification of monsoons in SEA, resulting in greater genetic diversity and a longer evolutionary history in SEA than in NEA [2–4]. Isolation in different mountains also caused greater genetic differentiation and more detailed genetic structures [14]. The detailed west–east substructure corresponds well to the Tanaka–Kaiyong Line, an important phytogeographic boundary in Southwest China [47]. Ancestral populations have survived in the Hengduan Mountains, and populations in the westernmost and central–east regions have been formed through migrations of these ancestral populations, which is consistent with the eastward migration revealed by the cpDNA and nSSR data [14].
In NEA, L. obtusiloba survived the unsuitable climate during the LGM in the Korea Peninsula and Japanese Archipelago. Long-term survival in the NEA region has also been reported for Asian butternuts [15], Cercidiphyllum [48], and Euptelea [16], indicating that northern migrorefugia are shared among different Tertiary relict species [8]. Ancestral area reconstruction revealed that populations in North China are formed through eastward postglacial migrations, and the gradually decreased cpDNA haplotype diversity with decreasing longitude provide further evidence [14]. However, compared with the SEA region, a more severe climate and a narrower niche would cause a faster loss of alleles, resulting in a shorter evolutionary history and lower genetic diversity [7, 8]. Lower genetic differentiation with ambiguous genetic substructure (Figs. 1 and 2) may be caused by lower physiographical and climatic heterogeneity and higher connectivity [8].
In the NEA and SEA regions, demographic expansions were both found after the last interglacial, which is in line with the prediction of glacial contraction and interglacial expansions during the Quaternary [7, 8]. Most relict species experienced multiple cycles of demographic expansion and contraction during the late Tertiary/Quaternary periods, e.g., Tetracentron sinense [49], Davidia involucrata [50], and Ginkgo biloba [51], whereas some species continued to decline after the LGM [50, 51]. L. obtusiloba populations in both SEA and NEA remained stable during and after the LGM, suggesting that Quaternary climate fluctuations, particularly the LGM, had a minor effect on their demographic histories.
Conclusions
The evolution of East Asian Tertiary relict species is complex, as none of the three proposed origin scenarios are fully supported, and the diversification patterns vary between regions in L. obtusiloba. Analyses based on a limited number of neutral nuclear loci (27 loci, approximately 13 kb) may limit the ability to distinguish complex demographic scenarios with low levels of migration. Genomic data with richer genetic information and adaptation loci are likely to help elucidate the origin and adaptation history of L. obtusiloba. Some other Tertiary relict species also exhibit independent long-term persistence in northern populations, e.g., Asian butternuts [15], Cercidiphyllum [48], and Euptelea [16]. An integrated study of these species will lead to a deeper understanding of the complex evolutionary history of East Asian Tertiary relict species.
Supplementary Information
Supplementary Material 1. Supporting file S1 Different origin scenarios simulated in msABC and according codes.
Supplementary Material 2. Table S1 Genetic diversity and neutrality test in 27 nuclear low-copy genes of Lindera obtusiloba. Figure S1 The five major floristic divisions, Eastern Himalayan Province (A), Sikang–Yunnan Province (B), Central Chinese Province (C), North Chinese Province (D) and Japanese–Korean Province (E), in East Asia according to Wu and Wu [1]. Figure S2 Delta-K and LnP(D) values from the STRUCTURE analysis on 24 Lindera obtusiloba populations using 27 nuclear low-copy genes with predefined group number K = 1–20. Standard deviations of LnP(D) obtained from 10 independent runs for each group number are also shown. Figure S3 IMa2 analysis results of Lindera obtusiloba populations using 27 low-copy nuclear genes. Posterior probability distributions of divergence time (t) between NEA and SEA populations (a) and effective population size of NEA, SEA and Ancestral populations (b) are illustrated. Figure S4 Posterior probability distributions of estimated demographic parameters in SEA origin (expansion) scenario. 1 and 2 represent populations in SEA and NEA, respectively. Figure S5 Posterior probability distributions of estimated demographic parameters in NEA origin (expansion) scenario. 1 and 2 represent populations in SEA and NEA, respectively. Table S2 Voucher numbers for 24 sampled populations of Lindera obtusiloba. Table S3 Bio-climatic variables, standardized loading for the two first axes of the principal component analysis (PCA) (present climate) and result of isolation by environment (IBE) analysis. Figure S6 Ancestral area reconstructions derived by DEC+j model usging BioGeoBEARS base on the BEAST-derived chronograms of 24 populations of Lindera obtusiloba with outgroup L. erythrocarpa (ZHGS) using 15 low-copy nuclear gene loci (the remaining 12 loci were not successfully sequenced in L. erythrocarpa) based on the estimated substitution rate, mean: 4.58 × 10-9 (95% highest-probability-density interval, HPD, : 1.66–12.69 × 10-9). Five geographic regions, Eastern Himalayan Province (A), Sikang–Yunnan Province (B), Central Chinese Province (C), North Chinese Province (D) and Japanese–Korean Province (E), were defined according to the floristic divisions [1].
